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We present here a review of the fundamental topics of Hartree-Fock theory in Quantum 
Chemistry. From the molecular Hamiltonian, using and discussing the Born-Oppenheimer ap- 
proximation, we arrive to the Hartree and Hartree-Fock equations for the electronic problem. 
Special emphasis is placed in the most relevant mathematical aspects of the theoretical deriva- 
tion of the final equations, as well as in the results regarding the existence and uniqueness 
of their solutions. All Hartree-Fock versions with different spin restrictions are systematically 
extracted from the general case, thus providing a unifying framework. Then, the discretization 
of the one-electron orbitals space is reviewed and the Roothaan-Hall formalism introduced. 
This leads to a exposition of the basic underlying concepts related to the construction and 
selection of Gaussian basis sets, focusing in algorithmic efficiency issues. Finally, we close the 
review with a section in which the most relevant modern developments (specially those related 
to the design of linear-scaling methods) are commented and linked to the issues discussed. The 
whole work is intentionally introductory and rather self-contained, so that it may be useful 
for non experts that aim to use quantum chemical methods in interdisciplinary applications. 
Moreover, much material that is found scattered in the literature has been put together here 
to facilitate comprehension and to serve as a handy reference. 
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1 Introduction 

In the hot field of computer simulation of biological macromolecules, available po- 
tential energy functions are often not accurate enough to properly describe complex 
processes such as the folding of proteins [1-7]. In order to improve the situation, it is 
convenient to extract ab initio information from quantum mechanical calculations 
with the hope of being able to devise less computationally demanding methods 
that can be used to tackle large systems. In this spirit, the effective potential for 
the nuclei calculated in the non-relativistic Born-Oppenheimer approximation is 
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typically considered as a good reference to assess the accuracy of cheaper poten- 
tials [8-14]. The study of molecules at this level of theoretical detail and the design 
of computationally efficient approximations for solving the demanding equations 
that appear constitute the major part of the field called quantum chemistry [15,16]. 
In this work, we voluntarily circumscribe ourselves to the basic formalism needed 
for the ground-state quantum chemical calculations that are typically performed 
in this context. For more general expositions, we refer the reader to any of the 
thorough accounts in refs. [17-19]. 

In sec. [2j we introduce the molecular Hamiltonian and a special set of units (the 
atomic ones) that are convenient to simplify the equations. In sec. [3l we present 
in an axiomatic way the concepts and expressions related to the separation of the 
electronic and nuclear problems in the Born-Oppenheimer scheme. In sec. 01 we in- 
troduce the variational method that underlies the derivation of the basic equations 
of the Hartree and Hartree-Fock approximations, discussed in sec. [6] and [7] respec- 
tively. The computational implementation of the Hartree-Fock approximation is 
tackled in sec. El where the celebrated Roothaan-Hall equations are derived. In 
sec. [H the main issues related to the construction and selection of Gaussian basis 
sets are discussed, and, finally, in sec. [TUJ, the hottest areas of modern research are 
briefly reviewed and linked to the issues in the rest of the work, with a special 
emphasis in the development of linear-scaling methods. 

2 Molecular Hamiltonian and atomic units 

Since 1960, the international scientific community has agreed on an 'official' set of 
basic units for measurements: Le Systeme International d'Unites, or SI for short 
(see http://www.bipm.org/en/si/ and ref. [20]). The meter (m), the kilogram 
(kg), the second (s), the ampere (A), the kelvin (K), the mole (mol), the joule (J) 
and the pascal (Pa) are examples of SI units. 



Sticking to the SI scheme, the non-relativistic quantum mechanical Hamiltonian 
operator of a molecule consisting of Njy nuclei (with atomic numbers Z a and masses 
Mn, a = 1, . . . , -/Vjv) and N electrons (i.e., the molecular Hamiltonian) is expressed 




where h stands for h/2n, being h Planck's constant, m e denotes the electron 
mass, e the proton charge, the position of the i-th electron, R a that of the a-th 
nucleus, eo the vacuum permittivity and V? the Laplacian operator with respect 
to the coordinates of the i-th particle. 

Although using a common set of units presents obvious communicative advan- 
tages, when circumscribed to a particular field of science, it is common to appeal to 
non-SI units in order to simplify the most frequently used equations by getting rid 
of some constant factors that always appear grouped in the same ways and, thus, 



1 Note that the non-relativistic molecular Hamiltonian does not depend on spin-like variables. 
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Tabic 1. Atomic units up to five significant digits. Taken from the 
National Institute of Standards and Technology (NIST) web page at 
http://physics.nist.gov/cuu/Constants/. Note that only four independent units 
are required in a mechanical-plus-clcctromagnctic system. The rest of them can be 
easily obtained from any such four. For example, using the units in the table, the 
relations h — 1 and l/(47reo) — 1 result. 

Unit of mass: mass of the electron = m e = 9.1094 ■ 10~ 31 kg 
Unit of charge: charge on the proton = e = 1.6022 ■ 10~ 19 C 

Unit of length: 1 bohr = a = 4?re Pjf = 0.52918 A= 5.2918 ■ 10" 11 m 
Unit of energy: 1 hartree = = 627.51 kcal/mol = 4.3597 - 10 -18 J 



Table 2. Energy units conversion factors to five significant digits. Taken from the National Insti- 
tute of Standards and Technology (NIST) web page at http : //physics .nist . gov/ cuu/Constants/. 
The table must be read by rows. For example, the value 4.1838, in the third row, fourth column, 
indicates that 1 kcal/mol — 4.1838 kj/mol. 





1 hartree 




1 eV 
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1 kj/mol 


1 cm 1 
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2.8592 ■ 10" 3 


1.1962 ■ 10" 2 


1 



make the numerical values in any calculation of the order of unity. In the field of 
quantum chemistry, atomic units (see table [1]), proposed in ref. [21] and named in 
ref. [22], are typically used. In these units, eq. ([!]) is substantially simplified to 



N N N 

a=l i=l ct^fi 

N N N 

^^ \R a -r l \ 2 \ rj - Ti 

i=l a=l i^tj J 



Since all the relevant expressions in quantum chemistry are derived in one way 
or another from the molecular Hamiltonian, the simplification brought up by the 
use of atomic units propagates to the whole formalism. Consequently, they shall 
be the choice all throughout this work. 

Apart from the atomic units and the SI ones, there are some other miscellaneous 
units that are often used in the literature: the angstrom, which is a unit of length 
defined as 1 A = 10~ 10 m, and the units of energy cm -1 (which reminds about the 
spectroscopic origins of quantum chemistry and, even, quantum mechanics), elec- 
tronvolt (eV), kilocalorie per mole (kcal/mol) and kilo joule per mole (kj/mol). The 
last two are specially used in the field of macromolecular simulations and quantify 
the energy of a mole of entities; for example, if one asserts that the torsional barrier 
height for H2O2 is ~ 7 kcal/mol, one is really saying that, in order to make a mole 
of H 2 2 (i.e., A^a ^ 6.0221 • 10 23 molecules) rotate 180° around the 0-0 bond, 
one must spend ~ 7 kcal. For the conversion factors between the different energy 
units, see table El 

Finally, to close this section, we rewrite eq. ((2j) introducing some self-explanatory 
notation that will be used in the subsequent discussion: 
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H = T N + T e + V NN + V eN + V ee , (3a) 

N N 
N 



2 

i=l 



(3c) 

V NN := \ £ %^ , (3d) 



N N N 



-TLi- 



3 The Born-Oppenheimer approximation 

To think of a macromolecule as a set of quantum objects described by a wavefunc- 
tion ^>(Xi, . . . , Xn n , x\, . . . , xn) dependent on the spatial and spirQ degrees of 
freedom, Xi := (r», (Tj), of the electrons and on those of the nuclei, X a := (R a , S a ), 
would be too much for the imagination of physicists and chemists. All the lan- 
guage of chemistry would have to be remade and simple sentences in textbooks, 
such as "rotation about this single bond allows the molecule to avoid steric clashes 
between atoms" or even "a polymer is a long chain-like molecule composed of 
repeating monomer units", would have to be translated into long and counter- 
intuitive statements involving probability and 'quantum jargon'. Conscious or not, 
we think of molecules as classical objects. 

More precisely, we are ready to accept that electrons are quantum (we know 
of the interference experiments, electrons are light, we are accustomed to draw 
atomic 'orbitals', etc.), however, we are reluctant to concede the same status to 
nuclei. Nuclei are heavier than electrons (at least ~ 2000 times heavier, in the case 
of the single proton nucleus of hydrogen) and we picture them in our imagination 
as 'classical things' that move, bond to each other, rotate around bonds and are at 
precise points at precise times. We imagine nuclei 'slowly moving' in the field of the 
electrons, which, for each position of the first, immediately 'adjust their quantum 
state'. 

The formalization of these ideas is called Born-Oppenheimer (BO) approximation 
[23,24] and the confirmation of its being good for many relevant problems is a fact 
that supports our intuitions about the topic and that lies at the foundations of the 
vast majority of the images, the concepts and the research in quantum chemistr}H. 



1 One convenient way of thinking about functions that depend on spin-like variables is as an m-tuplc 
of ordinary K 3JV functions, where m is the finite number of possible values of the spin. In the case of a 
one-particular wavefunction describing an electron, for example, a can take two values (say, —1/2 and 1/2) 

in such a way that one may picture any general spin-orbital ^i(x) as a 2-tuple (*~ 1/2 (r)>*i /2 0))- Of 
course, another valid way of imagining ^i(x) is simply as a function of four variables, three real and one 
discrete. 

2 There are many phenomena, however, in which the Born-Oppenheimer approximation is broken. For 
example, in striking a flint to create a spark, mechanical motion of the nuclei excites electrons into a 



February 1, 2008 



5:24 Molecular Physics introSCF 



A review of Hartree-Fock SCF 



5 



Like any approximation, the Born-Oppenheimer one may be either derived from 
the exact problem (in this case, the entangled behaviour of electrons and nuclei as 
the same quantum object) or simply proposed on the basis of physical intuition, 
and later confirmed to be good enough (or not) by comparison with the exact 
theory or with the experiment. Of course, if it is possible, the first way should 
be preferred, since it allows to develop a deeper insight about the terms we are 
neglecting and the specific details that we will miss. However, although in virtually 
every quantum chemistry book [18, 19, 26-28] hand-waving derivations up to dif- 
ferent levels of detail are performed and the Born-Oppenheimer approximation is 
typically presented as unproblematic, it seems that the fine mathematical details on 
which these 'standard' approaches are based are far from clear [29-31]. This state 
of affairs does not imply that the final equations that will need to be solved are 
ill-defined or that the numerical methods based on the theory are unstable; in fact, 
it is just the contrary (see the discussion below), because the problems are related 
only to the precise relation between the concepts in the whole theory and those 
in its simplified version. Nevertheless, the many subtleties involved in a derivation 
of the Born-Oppenheimer approximation scheme from the exact equations suggest 
that the second way, that of proposing the approximation, be taken. Hence, in the 
following paragraphs, an axiomatic presentation of the main expressions, aimed 
mostly to fix the notation and to introduce the language, will be performed. 

First of all, if we examine the Hamiltonian operator in eq. ([2]), we see that the 
term V e M prevents the problem from being separable in the nuclear and electronic 
coordinates, i.e., if we define x := {x\, . . . , xn) as the set of all electronic coordinates 
(spatial and spin-like) and do likewise with the nuclear coordinates X, the term V e N 
prevents any wavefunction \&(X, x) solution of the time-independent Schrddinger 
equation, 



from being written as a product, ^(X, x) = ^ r Jv(X)^ r e (x), of an electronic wave- 
function and a nuclear one. If this were the case, the problem would still be difficult 
(because of the Coulomb terms Vnn and V ee ), but we would be able to focus on 
the electrons and on the nuclei separately. 

The starting point for the Born-Oppenheimer approximation consists in assuming 
that a less strict separability is achieved, in such a way that, for a pair of suitably 
chosen and ^efeX), an Y wavefunction solution of eq. Q (or at least those 

in which we are interested; for example, the eigenstates corresponding to the lowest 
lying eigenvalues) can be expressed as 



where we have used a ';' to separate the two sets of variables in the electronic 
part of the wavefunction in order to indicate that, in what follows, it is convenient 
to use the image that 'from the point of view of the electrons, the nuclear degrees 
of freedom are fixed', so that the electronic wavefunction depends 'parametrically' 
on them. In other words, that the X are not quantum variables in eq. © below. 
Of course, it is just a 'semantic' semicolon; if anyone feels uncomfortable about it, 
she may drop it and write a normal comma. 



fr*Qc,x) 



f N + f e + V NN + V eN + V ee ) *(X, x) = E *(X, x) 



(4) 




(5) 



plasma that then emits light [25] . 
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Notably, in ref. [32], Hunter showed that any solution of the Schrodinger equation 
can in fact be written exactly in the form of eq. ([5]), and that the two functions, 
VPjvQQ and VPefeX), into which \P(X,x) is split may be interpreted as marginal 
and conditional probability amplitudes respectively. However, despite the insight 
that is gained from this treatment, it is of no practical value, since the knowledge 
of the exact solution ^(X, x) i s required in order to compute Vl/jvQO and ^efeX) 
in Hunter's approach. 

In the Born-Oppenheimer scheme, an additional assumption is made in order to 
avoid this drawback: the equations obeyed by the electronic and nuclear parts of 
the wavefunction are supposed to be known. Hence, ^efeX) is assumed to be a 
solution of the time-independent clamped nuclei Schrodinger equation, 

(f e + V eN (i;R) + Vee{L)) *efe E) := H e (K) * e (x; R) = E e (R) (s R) , (6) 

where the electronic Hamiltonian operator H e (R) and the electronic energy 
E e (R) (both dependent on the nuclei positions) have been defined, and, since the 
nuclear spins do not enter the expression, we have explicitly indicated that ^> e 
depends parametrically on R and not on X. 

The common interpretation of the clamped nuclei equation is, as we have ad- 
vanced at the beginning of the section, that the nuclei are much 'slower' than the 
electrons and, therefore, the latter can automatically adjust their quantum state 
to the instantaneous positions of the former. Physically, eq. ([6]) is just the time- 
independent Schrodinger equation of N particles (the electrons) of mass m e and 
charge — e in the external electric field of Nn point charges (the nuclei) of size 
eZ a at locations R a . Mathematically, it is an eigenvalue problem that has been 
thoroughly studied in the literature and whose properties are well-known [33-38]. 
In particular, it can be shown that, in the case of neutral or positively charged 
molecules (i.e., with Z := Z a > N), the clamped nuclei equation has an infinite 
number of normalizable solutions in the discrete spectrum of H e (R) (bound- states) 
for every value of R [39,40]. 

These solutions must be regarded as the different electronic energy levels, and a 
further approximation that is typically made consists in, not only accepting that 
the electrons immediately 'follow' nuclear motion, but also that, for each value of 
the nuclear positions R, they are in the electronic ground-stat^E i-e., the one with 
the lower E e (R). 

Consequently, we define 

Ef (R) := E° e (R) • (7) 

to be the effective electronic field in which the nuclei move, in such a way that, 
once we have solved the problem in eq. ([6]) and know E®(R), the time-independent 
nuclear Schrodinger equation obeyed by ^at(X) is: 

(f N + V NN (R) + Ef(R)) *jv(X) := H N * N QC) = E N * N (X) , (8) 



1 This is customarily assumed in the literature and it is supported by the general fact that electronic 
degrees of freedom are typically more difficult to excite than nuclear ones. Hence, in the vast majority of 
the numerical implementations of the theory, only the electronic ground-state is sought. We will see this 
in the forecoming sections. 
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where the effective nuclear Hamiltonian Hn has been implicitly defined. 

Now, to close the section, we put together the main expressions of the Born- 
Oppenheimer approximation for quick reference and we discuss them in some more 
detail: 



H e {R) * e (x; R) := (f e + V eN (R) + F ee ) ^ e (x; R) = E e (R) ^ e (x;R) , (9a) 

Ef(R) := E° e (R) , (9b) 

H N V N QQ := (f N + V NN (R) + Ef (R)) V N (X) = E N V N (X) , (9c) 

#(x,X) ~ *°(x;R) *tv(X) , E~E N . (9d) 

To start, note that the above equations are written in the logical order in which 
they are imagined and used in any numerical calculation. First, we assume the nu- 
clei fixed at R and we (hopefully) solve the clamped nuclei electronic Schrodinger 
equation (eq. (|9"aj) ). obtaining the electronic ground-state ^(x, R) with its corre- 
sponding energy E%(R). Next, we repeat this procedure for all possible valued of R 
and end up with an hyper-surface E® (R) in R-space. Finally, we add this function 
to the analytical and easily computable Vnn(R) and find the effective potential 
that determines the nuclear motion: 



Vn (S) := Vnn{R) + E° e (R) . (10) 

It is, precisely, this effective potential that is called Potential Energy Surface 
(PES) (or, more generally, Potential Energy Hyper-Surface (PEHS)) in quantum 
chemistry and that is the central object through which scientists picture chemical 
reactions or conformational changes of macromolecules [41]. In fact, the concept 
is so appealing and the classical image so strong that, after 'going quantum', we 
can 'go classical' back again and think of nuclei as perfectly classical particles that 
move in the classical potential V^(R). In such a case, we would not have to solve 
eq. (j9c|) but, instead, integrate the Newtonian equations of motion. This is the basic 
assumption of every typical force field used for classical ground-state molecular 
dynamics, such as the ones in the popular CHARMM [42,43], AMBER [44-46] or 
OPLS [47] packages. 

Finally, we would like to remind the reader that, despite the hand- waving char- 
acter of the arguments presented, up to this point, every computational step has a 
clear description and eqs. (f9a|) through ([9cj) could be considered as definitions in- 
volving a certain degree of notational abuse. To assume that the quantities obtained 
through this process are close to those that proceed from a rigorous solution of the 
time independent Schrodinger equation (eq. ([!])) is where the approximation re- 
ally lies. Hence, the more accurate eqs. (|9dp are, the better the Born-Oppenheimer 
guess is, and, like any other one, if one does not trust in the heuristic grounds 
on which the final equations stand, they may be taken as axiomatic and judged a 
posteriori according to their results in particular casea^l 



1 Of course, this cannot be done in practice. Due to the finite character of available computational resources, 
what is customarily done is to define a 'grid' in R-space and compute E® (R) in a finite number of points. 

2 Until now, two approximations have been done: the non-relativistic character of the objects studied 
and the Born-Oppenheimer approximation. In the forecoming, many more will be done. The a priori 
quantification of their goodness in large molecules is a formidable task and, despite the efforts in this 
direction, in the end, the comparison with experimental data is the only sound method for validation. 
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In quantum chemistry, the Born-Oppenheimer approximation is assumed in a 
great fraction of the studies and it allows the central concept of potential energy 
surface to be well-defined, apart from considerably simplifying the calculations. 
The same decision is taken in this work. 



4 The variational method 

There exists a mathematically appealing way of deriving the time independent 
Schrodinger equation (eq. from an extremal principle. To this end, we first 
define the functional (see appendix A) that corresponds to the expected value of 
the energy, 

:= (*\ H \V) , (11) 
where the traditional bra and ket notation is read as 



(*|6|tf) := J ^*{x)d^(x)dx , (12) 

O being any operator in the space of wavefunctions and x a dummy variable 
representing all possible coordinates on which depends. The norm of \£, in this 
notation is expressed as ( 1 I / | V I / ) = / ^(x)! 2 da;, and we shall say that Vl/ is normalized 
if (tf = 1. 

If we want to optimize the energy functional above restricting the search space to 
the normalized wavefunctions, the constrained-extremals problem that results can 
be solved via the Lagrange multipliers method (see appendix B) by constructing 
the associated functional J-[^f], where we introduce a Lagrange multiplier A to force 
normalization: 

:= F[% + A |tf > - l) . (13) 

If we now ask that the functional derivative of J~\}&\ with respect to the complex 
conjugate ty* of the wave functional be zero, i.e., we look for the stationary points of 
J-[&] conditioned by (^l^) = 1, we obtain the eigenvalues equation for H, i.e., the 
time-independent Schrodinger equation. Additionally, it can be shown, first, that, 
due to the self-adjointedness of H, the equation obtained from the stationarity 
condition with respect to (not *&*) is just the complex conjugate and adds no 
new information. 

Moreover, one can see that the reverse implication is also true [48], so that, if 
a given normalized wavefunction ^ is a solution of the eigenvalue problem and 
belongs to the discrete spectrum of H, then the functional in eq. (fTB"|) is stationary 
with respect to 



1 A function of a complex variable z (or, analogously, a functional on a space of complex functions) may 
be regarded as depending on two different sets of independent variables: either Re(^) and Im(,z) or z and 
z* . The choice frequently depending on technical issues. 
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^3HI = = -A* := E$> and (*|*) = 1 . (14) 

This result, despite its conceptual interest, is of little practical use, because it 
does not indicate an operative way to solve the Schrodinger equation different 
from the ones that we already knew. The equivalence above simply illustrates 
that mathematical variational principles are over-arching theoretical statements 
from which the differential equations that actually contain the details of physical 
systems can be extracted. Nevertheless, using similar ideas, we will derive another 
simple theorem which is indeed powerfully practical: the Variational Theorem. 

Let {^ n } be a basis of eigenstates of the Hamiltonian operator H and {E n } 
their corresponding eigenvalues. Since H is self-adjoint, the eigenstates can be 
chosen to be orthonormal (i.e., (^ m \^ n ) = S mn ) and any normalized wavefunction 

in the Hilbert space can be written as a linear combination of themE): 



I*) = Yl C n\^n) provided that ^ \C n \ 2 = 1 . (15) 

n n 

If we now denote by Eq the lowest E n (i.e., the energy of the ground-statejl and 
calculate the expected value of the energy on an arbitrary state ^ such as the one 
in eq. (fT5|) . we obtain 

(#| H = C* m C n {y m \ H\^ n ) = Y^ C* m C n E n (y m \V n ) 

m,n m,n 
= ^2 C^CnEnSmn = ^ \C n \ 2 E n > ^ \C n \ 2 Eq = Eq . (16) 



This simple relation is the Variational Theorem and it states that any wavefunc- 
tion of the Hilbert space has an energy larger than the one of the ground-state (the 
equality can only be achieved if = ^o). However trivial this fact may appear, 
it allows a very fruitful 'everything-goes' strategy when trying to approximate the 
ground-state in a difficult problem. If one has a procedure for finding a promis- 
ing guess wavefunction (called variational ansatz), no matter how heuristic, semi- 
empirical or intuitive it may be, one may expect that the lower the corresponding 
energy, the closer to the ground-state it isj This provides a systematic strategy 
for improving the test wavefunction which may take a number of particular forms. 

One example of the application of the Variational Theorem is to propose a fam- 
ily of normalized wavefunctions parametrically dependent on a number 9 and 



1 We assume here, for the sake of simplicity and in order to highlight the relevant concepts, that H has 
only discrete spectrum. The ideas involved in a general derivation are the same, but the technical details 
and the notation are more complicated [49]. 

2 Its existence is not guaranteed: it depends on the particular potential in H . However, for the physically 
relevant cases, there is indeed a minimum energy in the set {E n }. 

3 Of course, this not necessarily so (and, in any case, it depends on the definition of 'closer'), since it could 
happen that the (ty\H\ty) landscape in the constrained subset of the Hilbert space in which the search is 
performed be 'rugged'. In such a case, we may have very different wavefunctions (say, in the sense of the 
L 2 -norm) with similar energies (^\H\^/). The only 'direction' in which one can be sure that the situation 
improves when using the variational procedure is the (very important) energetic one. That one is also 
moving towards better values of any other observable is, in general, no more than a bona fide assumption. 
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calculate the ^-dependent expected value of the energjQ: 

E(0) :={* \H\* ) . (17) 

Then, one may use the typical tools of one- variable calculus to find the minimum 
of E(0) and thus make the best guess of the energy Eq constrained to the family 
^>q. If the ansatz is cleverly chosen, this estimate could be rather accurate, however, 
for large systems that lack symmetry, it is very difficult to write a good enough 
form for ^g. 

When dealing with a large number of particles, there exists another protocol 
based on the Variational Theorem that will permit us to derive the Hartree and 
Hartree-Fock equations for the electronic wavefunction \E' e (see sees. [6] and re- 
spectively). The first step is to devise a restricted way (a function / with no free 
parameters) to express ^ e in terms of one-electron wavefunctions, also called or- 
bitals and denoted^ by {if) a (x)}, thus reducing the search space to a (typically 
small) subset of the whole Hilbert space: 



* e ( Xl ,...,x N ) = f(iMxi)}) ■ (18) 

The second step consists in establishing a (possibly infinite) number of con- 
straints on the one-electron functional, 



L k ({M*i)})=0- (19) 

With these two ingredients, we can now write the Lagrange functional that de- 
scribes the constrained problem in terms of the orbitals ip a (see eq. (I13D ): 



/({^a(x i )})) + ^A fc L fc ({^ a (x i )}) • (20) 

k 

Finally, we take the derivatives of •7 r [{V'a}] with respect to every ip a (x) (normally, 
with respect to the complex conjugate ipl(x), see footnote CD in page [8]) and we ask 
each one to be zero (see appendix A). This produces the final equations that must 
be solved in order to find the stationary one-electron orbitals. 

Of course, these final equations may have multiple solutions. In the cases dis- 
cussed in this work, there exist procedures to check that a particular solution 
(found computationally) is, not only stationary, but also minimal [50]. However, to 
assure that it is, not only locally minimal, but also globally (i.e., that is optimal), 
could be, in general, as difficult as for any other multi-dimensional optimization 
problem [51-53]. In the Hartree and Hartree-Fock cases, discussed in sees. [6] and [7J 



F[m] = (f({M^)}) 



H 



4 Note that, if the functions \&g were not normalized, then we should deal with the constrained problem 
as in H13I I, or, equivalently, we could include a dividing overlap term {^g\^g) m G3 - 

In principle, there could be more orbitals than electrons, however, in both the Hartree and Hartree-Fock 
applications of this formalism, the index a runs, just like i, from 1 to N. 

2 Actually, both restrictions (the one at the level of the total wavefunction in eq. dl8l and the one 
involving the one-particle ones in eq. 11191 are simply constraints (see appendix B). The distinction is not 
fundamental but operative, and it also helps us to devise variational ansatzs separating the two conceptual 
playgrounds. 
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respectively, the aufbau principle and a clever choice of the starting guess constitute 
particular techniques intended to alleviate this problem. 



5 Statement of the problem 

Assuming the Born-Oppenheimer approximation (see sec. Eland eqs. ([!])), the cen- 
tral problem that one must solve in quantum chemistry is to find the ground-state 
of the electronic Hamiltonian for a fixed position R of the nuclei 



N N N N 

^=T + ^ + Ke:=-£^-££f^E^- (21) 

i=l z i=i «=l ai i+j ^ 

As already remarked in sec.0 this problem is well posed for neutral and positively 
charged molecules, and, in the same way in which the term V e N prevented the total 
wavefunction to be a product of an electronic and a nuclear part, the term V ee in 
the expression above breaks the separability in the one-electron variables X{ of 
the electronic time-independent Schrodinger equation associated to H. Hence, a 
general solution ^(x) cannot be a product of orbitals and the search must be a 
priori performed in the whole Hilbert space. However, this is a much too big place 
to look for ^(x), since the computational requirements to solve the Schrodinger 
equation grow exponentially on the number of electrons. 

Partially recognizing this situation, in the first days of quantum mechanics, Dirac 
wrote that, 

The underlying physical laws necessary for the mathematical theory of a large part 
of physics and the whole of chemistry are thus completely known, and the difficulty 
is only that the exact application of these equations leads to equations much too 
complicated to be soluble. It therefore becomes desirable that approximate practical 
methods of applying quantum mechanics should be developed, which can lead to 
an explanation of the main features of complex atomic systems without too much 
computation [49]. 

The description of the most popular approximate methods, which the great physi- 
cist envisaged to be necessary, will be the objective of the following sections. Two 
basic points responsible of the relative success of such an enterprise are the severe 
reduction of the space in where the ground-state is sought (which, of course, leads 
to only an approximation of it) and the availability of computers unimaginably 
faster than anything that could be foreseen in times of Dirac. 



6 The Hartree approximation 

One of the first and most simple approximations aimed to solve the problem posed 
in the previous section is due to Hartree in 1927 [21] (although the way in which 
the Hartree equations will be derived here, using the Variational Theorem, is due 
to Slater [54]). In this approximation, the total wavefunction is constrained to be 
a product (typically referred to as Hartree product) of N one-electron orbitals (see 



1 Since, from now on, we will only be dealing with the 'electronic problem', the notation has been made 
simpler by dropping superfluous subindices e where there is no possible ambiguity. As a consequence, for 
example, the electronic Hamiltonian is now denoted by H, the electronic kinetic energy by T and the 
electronic wavefunction by *&(x) (dropping the parametric dependence on R in the same spirit). 
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eq. (jl8p ). where the spin of the electrons and the antisymmetry (i.e., the Pauli 
exclusion principle) are not taken into account^: 



N 



$(r 1 ,...,r JV ) = n<Mn) , (22) 



i=i 



where the a index in the orbitals has been substituted by i due to the fact that 
each function is paired to a specific set of electron coordinates, consequently being 
the same number of both of them. 

Also, the additional requirement that the one-particle wavefunctions be normal- 
ized is imposed (see eq. ([19]) ): 



1, i = l,...,N. (23) 



With these two ingredients, we can construct the auxiliary functional whose 
zero-derivative condition produces the solution of the constrained stationary points 
problem (see eq. (|20p ). To this effect, we introduce N Lagrange multipliers £j that 



force the normalization constraints^: 



N 



H 



N \ N 



where the minus sign in the Lagrange multipliers term is chosen in order to get 
to the most common form of the final equations without having to define new 
quantities 

This functional may be considered to depend on 2N independent functions: the 
N one-electron fa and their N complex conjugates (see footnote [T] in paged]). The 
Hartree equations are then obtained by imposing that the functional derivative of 
T with respect to (f)t be zero for k = 1, . . . ,N. In order to obtain them and as an 
appetizer for the slightly more complicated process in the more used Hartree-Fock 
approximation, the functional derivative will be here computed in detail following 
the steps indicated in appendix A. 

First, we write oulU the first term in the right-hand side of eq. ((21 



2 We shall denote with capital Greek letters the wavefunctions depending on all the electronic variables, 
and with lowercase Greek letters the one-electron orbitals. In addition, by Nl" (or ip), we shall indicate 
wavefunctions containing spin part (called spin- orbitals) and, by <I> (or <f>), those that depend only on 
spatial variables. 

1 Note that the normalization of the total wavefunction is a consequence of the normalization of the 
one-electron ones and needs not to be explicitly asked. 

2 The limits in sums and products are dropped if there is no possible ambiguity. 
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H 



-5E(n 



* \3& 



(r)V 2 cfii(r)dr 



-e nwi) 



/ fl(r)fc(r) £ 
\A=1 



dr 



#(r)&(r)#(r')^(r') 



r — V 



dr'dr , (25) 



where dr denotes the Euclidean IR 3 volume element dxdydz. 

Now, we realize that the products outside the integrals can be dropped using the 
constraints in eq. (|23l) (see the last paragraphs of appendix B for a justification that 
this can be done before taking the derivative). Then, using the previous expression 
and conveniently rearranging the order of the integrals and sums, we write out the 
first term in the numerator of the left-hand side of eq. (|Alj) that corresponds to an 
infinitesimal variation of the function <jfi : 



Fl<f>t + e6<f>t) := 

F [<h-> 0i, • • • , fa, <t>% + e <% ■■■An, 4>*n] = 
-^E/^)V 2 ^)dr-£ / lUrf^y^] dr 

+ ^E/ E ^'_^ /)|2 dr/ ) dr -E^(/ Wr)| 8 dr-l) 

Hl(rW 2 Mr) dr - e J 6f k (r) </> k (r) (f^ ^j^j l d ^ 
+ e / 5</,%(r)<f> k (r) ( f E 'f dr') dr-ee k f <^(r) fe (r) dr . (26) 



We subtract from this expression the quantity f [{0i(rj)}l , so that the first four 
terms cancel, and we can write 
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lim 

e^O 



r r 1 / ^ 

/[-iv^ W -(E 



+ 



A|2 



r-R, 



M r ) 



r — r' 



dr') 4> k (r) - e k <t>k(r) 



(r) dr 



(27) 



Now, by simple inspection of the right-hand side, we see that the functional 
derivative (see eq. (|A1|) ) is the part enclosed by square brackets: 



5? [{&}] 



-y 2 + V e (r) + V c k (r) - e k ) Mr) , 



(28) 



where the nuclear potential energy and the electronic potential energy have been 
respectively defined as3 



V N (r) 



N N 

E 



Z A 



A=l 



r-R A \ ' 



E 



i^k 



, r\\2 



dr' . 



r — V 



(29a) 



(29b) 



Finally, if we ask the functional derivative to be zero for k = 1, . . . , N, we arrive 
to the equations that the stationary points must satisfy, the Hartree equations: 



HM Mr) ■-- 



1 



V 2 + V N (r) + V e k (r) Mr) = e k Mr 



(30) 



for all k = 1,..., N. 

Let us note that, despite the fact that the object H k [<j)] defined above is not a 
operator strictly speaking, since, as the notation emphasizes, it depends on the 
orbitals 4>%f-ki we W1U stick to the name Hartree operator for it, in order to be 
consistent with most of the literature. 

Now, some remarks related to the Hartree equations are worth making. First, it 
can be shown that, if the variational ansatz in eq. (|22p included the spin degrees of 
freedom of the electrons, all the expressions above would be kept, simply changing 
the orbitals (j>i{rj) by the spin-orbitals ^(r^Cj). 

Secondly, and moving into more conceptual playgrounds, we note that the special 
structure of V k (r) in eq. (|29bj) makes it mandatory to interpret the Hartree scheme 
as one in which each electron 'feels' only the average effect of the rest. In fact, if the 
quantum charge density Pi(r) := |i^j(r)| 2 is regarded for a moment as a classical 
continuum distribution, then the potential produced by all the electrons but the 
fc-th is precisely the one in eq. (I29b|) . Supporting this image, note also the fact 



1 Compare the notation with the one in eqs. (J3J, here a subindex e has been dropped to distinguish the 
new objects defined. 
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that, if we write the joint probability density of electron 1 being at the point r%, 
electron 2 being at the point T2 and so on (simply squaring eq. (1221) ). 

N N 

p(r u ...,r N ) := \$(r u . . . ,r N )\ =JJ|0 i (r i )| =JJp i (r i ), (31) 

i=l i=l 

we see that, in a probabilistic sense, the electrons are independent (they could 
not be independent in a physical, complete sense, since we have already said that 
they 'see' each other in an average way). 

Anyway, despite these appealing images and also despite the fact that, dis- 
guised under the misleading (albeit common) notation, these equations seem 'one- 
particle', they are rather complicated from a mathematical point of view. On the 
one hand, it is true that, whereas the original electronic Schrodinger equation 
in (f9a|) depended on 3iV spatial variables, the expressions above only depend on 3. 
This is what we have gained from drastically reducing the search space to the set of 
Hartree products in eq. (|22p and what renders the approximation tractable. On the 
other hand, however, we have paid the price of greatly increasing the mathematical 
complexity of the expressions, so that, while the electronic Schrodinger equation 
was one linear differential equation, the Hartree ones in (|30p are N coupled non- 
linear integro-differential equations [55]. 

This complexity precludes any analytical approach to the problem and forces us 
to look for the solutions using less reliable iterative methods. Typically, in compu- 
tational studies, one proposes a starting guess for the set of N orbitals {0°}; with 
them, the Hartree operator 7ifc[^°] in the left-hand side of eq. (|30|) is constructed 
for every k and the N equations are solved as simple eigenvalue problems. For each 
k, the (f>\ that corresponds to the lowest e\ is selected and a new Hartree opera- 
tor Hki^ 1 ] is constructed with the {4>\}- The process is iterated until (hopefully) 
the n-th set of solutions {0^} differ from the (re — l)-th one less than a 

reasonably small amount. 

Many technical issues exist that raise doubts about the possible success of such 
an approach. The most important ones being related to the fact that a proper 
definition of the Hartree problem should be: find the global minimum of the energy 
functional ($|J5"|$) under the constraint that the wavefunction & be a Hartree 
product and not: solve the Hartree equations (|30p . whose solutions indeed include 
the global minimum sought but also all the rest of stationary points. 

While the possibility that a found solution be a maximum or a saddle point 
can be typically ruled out [50, 55], as we remarked in sec. H] and due to the fact 
that there are an infinite number of solutions to the Hartree equations [56], to be 
sure that any found minimum is the global one is impossible in a general case. 
There exists, however, one way, related to a theorem by Simon and Lieb [57,58], 
of hopefully biasing a particular found solution of the Hartree equations to be the 
global minimum that we are looking for. They showed that, first, for neutral or 
positively charged molecules (Z > N), the Hartree global minimization problem 
has a solution (its uniqueness is not established yet [55]) and, second, that the min- 
imizing orbitals {4>k} correspond to the lowest eigenvalues of the Ti-ki^ 1 ] operators 
self-consistently constructed with therrQ. Now, although the reverse of the second 



1 In quantum chemistry, where the number of electrons considered is typically small, the version of the 
Hartree equations that is used is the one derived here, with the Hartree operators depending on the index k 
in a non-trivial way. However, if the number of electrons is large enough (s uch a s in condensed matter 
applications), is customary to add to the effective electronic repulsion in eq. J29bt the self-interaction of 
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part of the theorem is not true in general (i.e., from the fact that a particular set 
of orbitals are the eigenstates corresponding to the lowest eigenvalues of the asso- 
ciated Hartree operators, does not necessarily follow that they are the ones that 
minimize the energy) [55], in practice, the insight provided by Lieb and Simon's 
result is invoked to build each successive state in the iterative procedure described 
above, choosing the lowest lying eigenstates each time. In this way, although one 
cannot be sure that the global minimum has been reached, the fact that the found 
one has a property that the former also presents is regarded as a strong hint that it 
must be so (see also the discussion for the Hartree- Fock case in the next section). 

This drawback and all the problems arising from the fact that an iterative pro- 
cedure such as the one described above could converge to a fixed point, oscillate 
eternally or even diverge, are circumvented in practice by a clever choice of the 
starting guess orbitals {<fi®}- If they are extracted, for example, from a slightly less 
accurate theory, one may expect that they could be 'in the basin of attraction' of 
the true Hartree minimum (so that the stationary point found will be the correct 
one) and close to it (so that the iterative procedure will converge). This kind of 
wishful thinking combined with large amounts of heuristic protocols born from 
many decades of trial-and-error-derived knowledge pervade and make possible the 
whole quantum chemistry discipline. 



7 The Hartree-Fock approximation 

The Hartree theory discussed in the previous section is not much used in quantum 
chemistry and many textbooks on the subject do not even mention it. Although it 
contains the seed of almost every concept underlying the Hartree-Fock approxima- 
tion discussed in this section, it lacks an ingredient that turns out to be essential 
to correctly describe the behaviour of molecular species: the indistinguishability of 
the electrons. This was noticed independently by Fock [59] and Slater [54] in 1930, 
and it was corrected by proposing a variational ansatz for the total wavefunction 
that takes the form of a so-called Slater determinant (see eq. (|33p below). 

The most important mathematical consequence of the indistinguishability among 
a set of N quantum objects of the same type is the requirement that the total N- 
particle wavefunction must either remain unchanged (symmetric) or change sign 
(antisymmetric) when any pair of coordinates, Xi and swapped. In the first 

case, the particles are called bosons and must have integer spin, while in the second 
case, they are called fermions and have semi-integer spin. Electrons are fermions, 
so the total wavefunction must be antisymmetric under the exchange of any pair 
of one-electron coordinates. This is a property that is certainly not met by the 
single Hartree product in eq. (122j) but that can be easily implemented by forming 
linear combinations of many of them. The trick is to add all the possible Hartree 
products that are obtained from eq. (122D changing the order of the orbitals labels 
while keeping the order of the coordinates onea3, and assigning to each term the 
sign of the permutation p needed to go from the natural order 1,...,N to the 
corresponding one p(l), ■ ■ ■ ,p(N). The sign of a permutation p is 1 if p can be 
written as a composition of an even number of two-element transpositions, and it 
is —1 if the number of transpositions needed is odd. Therefore, we define T(p) as 



electron k with himself. In such a case, the Hartree operator is independent of k so that, after having 
achieved self-consistency, the orbitals tp^ turn out to be eigenstates corresponding to different eigenvalues 
of the same Hermitian operator, "H [</>], and, therefore, mutually orthogonal. 

1 It is immaterial whether the orbitals labels are kept and the coordinates ones changed or vice versa. 
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the minimum numbeiH of transpositions needed to perform the permutation p, and 
we write the sign of p as (— Y) T ^ P \ 

Using this, an antisymmetric wavefunction constructed from Hartree products 
of N different orbitals may be written as 



,x N 



^2 (- 1 ) T(p V P (i)(^i) • • • %(n)(xn) , 



(32) 



where the factor l/y/Nl enforces normalization of the total wavefunction \& (if 
we use the constraints in eq. (|34[) ) and Sn denotes the symmetric group of order 
N, i.e., the set of all permutations of ./V elements (with a certain multiplication 
rule) . 

The above expression is more convenient to perform the calculations that lead 
to the Hartree-Fock equations, however, there is also a compact way of rewriting 
eq. (]32p which is commonly found in the literature and that is useful to illustrate 
some particular properties of the problem. It is the Slater determinant: 



1 



^n{x\) 
iPn(x 2 ) 



1pl(x N ) 1p2(xN) ■ ■ ■ 1pN(x N ) 



(33) 



Now, having established the constraints on the form of the total wavefunction, 
we ask the Hartree-Fock one-electron orbitals to be, not only normalized, like we 
did in the Hartree case, but also mutually orthogonal: 



(4>i\il>j 



hJ 



N 



(34) 



Additionally note that, contrarily to what we did in the previous section, we 
have now used one-electron wavefunctions tpi dependent also on the spin a (i.e., 
spin-orbitals) to construct the variational ansatz. A general spin-orbital^] may be 
written as (see also footnote Q] in page HJ) 



^(x) =^ Q (r)a(er) + (/(r)/?(GT) , 



(35) 



where the functions a and (3 correspond to the spin-up and spin-down eigenstates 
of the operator associated to the z-component of the one-electron spin. They are 
defined as 



a(-l/2) = /3(— 1/2) = 1 
a(l/2) = 1 0(1/2) = 



(36) 



2 It can be shown that the parity of all decompositions of p into products of elementary transpositions is 
the same. We have chosen the minimum only for T(p) to be well defined. 

1 Note that, if we had not included the spin degrees of freedom, the search space would have been half as 
large, since, where we now have 2N functions of r (i.e., ff(r) and ip^(r), with i = 1, . . . , N), we would 
have had just N (the <f>i(r)). 
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The formalism obtained when these general spin-orbitals are used is accordingly 
called General Hartree-Fock (GHF). The first part of the mathematical treatment 
in the following paragraphs shall be performed assuming this situation. The ad- 
vantage of such a choice is that, later on, by imposing additional constraints to the 
spin part of the one-electron orbitals, we will be able to derive the basic equations 
for some other flavours of the Hartree-Fock theory, such as UHF, RHF and ROHF, 
in a very direct way. 

Now, to calculate the expected value of the energy in a state such as the one in 
eqs. (|32j) and (j33l) . let us denote the one-particle part (that operates on the i-th 
coordinates) of the total electronic Hamiltonian H in eq. (|2ip by 



vy2 N n 

4-E 



a=l 



R a V{ I 



(37) 



in such a way that, 



(*i h i*) = k i*) + \ —m, (38) 

where r« := \rj — fj|. 

We shall compute separately each one of the sums in the expression above. Let 
us start now with the first one: For a given i in the sum, and due to the structure 
of the electronic wavefunction in (f3"2j) . the expected value (^| hi I 1 !') is a sum of 
(AM) 2 terms of the form 



— (-l) r(p)+r(p ' ) (V'p(i)(xi) • ■•iP p (n)(%n) I hi | Vy(i)(zi) • ■■^p'(n)(xn)) , (39) 

but, since hi operates only on X{ and due to the orthogonality of the spin-orbitals 
with different indices, we have that the only non-zero terms are those with p = p'. 
Taking this into account, all permutations p appear still as terms of the sum, and 
we see that every orbital tfjj occurs depending on every coordinate x%. Given a 
particular pair % and j, this happens in the terms for which p(i) = j and one of 
such terms may be expressed as 



4 (il^fclV'fc)] tyjlhWi) , (40) 

where we have used that (— l) 2 ^^) = \ } and we have dropped the index i from hi 
noticing that the integration variables in (ipj(xi) \ hi \tpj(xi)) are actually dummy. 

Next, we use again the one-electron wavefunctions constraints in eq. (|34h to 
remove the product of norms in brackets, and we realize that, for each j, there 
are as many terms like the one in the expression above as permutations of the 
remaining N — 1 orbital indices (i.e., (N — 1)!). In addition, we recall that every j 
must appear and perform the first sum in eq. (|38p . yielding 
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X>i fa m = E(^ - x ) ! E m^-i * hM = 

^ " X ) ! E M^il & 1^) = E^'l h > (41) 
where all factorial terms have canceled out. 

The next step is to calculate the second sum in eq. ([38]) , Again, we have that, 
for each pair (^\ 1/rij \^f?) is a sum of (AH) 2 terms like 

1 (_i)^(p)+^')(^ (1)(a;i ) . . . 4, p{N) { XN ) | _L | Vy ( i)(*i) • • • ^ ( iv)(^)> • (42) 
i\ . r % j 

For this expected value, contrarily to the case of hi and due to the two- 
body nature of the operator not only do the terms with p = p' sur- 
vive, but also those in which p and p' differ over only a pair of values i and 
j, i.e., those for which p(i) = p'(j), p(j) = p'(i) and p(k) = p'(k),\/k ^ 
The reason for this is that, even if p(i) / p'(i) and p(j) ^ p'(j), the integral 
{^p(i){xi)^ P {j){xj)\ l/nj \i> p >(i)(xi)ij} pl y)(xj)) does not vanish. 

Now, using that operates only on Xi and Xj, the orthonormality conditions 

in eq. (|34h and the fact that (— l) 2 - 7 "^) = 1, we have that, when ijjk depends on X{ 
and ipi depends on Xj, the p = p' part of the corresponding terms in eq. (|42ft reads 

\\i>ki>i) , (43) 

where we have defined 

mi \ ■■= E// ^'"y'' 1 * 1 " ^ . (44) 

cr, cr' 

Next, we see that, for each pair (k, I), and keeping p = p', we can make (N — 2)! 
permutations among the N — 2 indices of the orbitals on which does not 

operate and still find the same expression (|43p . Therefore, for each pair (k,l), we 
have a sum of (iV — 2)! identical terms. In addition, if we perform the sum on i 
and j in eq. ([38]) and remark that the term in eq. ([4*3]) does not depend on the 
pair (which is obvious from the suggestive notation above), we have that the 
p = p' part of the second sum in eq. ([38]) , which is typically called Coulomb energy, 
reads 

\ E(iV - 2)! E ^kH - \Mi) = \ E<^l \ \Mi) , (45) 

i^j k^l ' kjtl 

where we have used that the sum i s performed on N(N — 1) identical terms 
which do not depend neither on i nor on j. 

On the other hand, in the case in which p and p' only differ in that the indices of 
the orbitals that depend on X{ and Xj are swapped, all the derivation above applies 



February 1, 2008 5:24 Molecular Physics introSCF 



20 Pablo Echenique and J. L. Alonso 

except for the facts that, first, (— 1) t (p)+^~(p ) = — 1 and, second, the indices k and 
/ must be exchanged in eq. (I45p (it is immaterial if they are exchanged in the bra 
or in the ket, since the indices are summed over and are dummy). Henceforth, the 
remaining part of the second sum in eq. (|38h . typically termed exchange energy, 
may be written as 



-^(^il-hMfc) ■ (46) 

k^l 



Finally, the expected value of the energy in the GHF variational state turns 
out to be 



^GHF 




- \tpiipj) - {4>iipj\ - \tpjipi) ) , ( I") 



where the one-electron integrals hi have been defined together with the two- 
electron integrals, JJy and K^, and the fact that J a = Ka,\/i has been used to 
include the diagonal terms in the second sum. 

Now, the energy functional above is the quantity that we want to minimize under 
the orthonormality constraints in eq. (|34p . So we are prepared to write the auxiliary 
functional J- , introducing N 2 Lagrange multipliers A™ (see eq. (I20p and compare 
with the Hartree example in the previous section): 



In order to get to the Hartree-Fock equations that the^ stationary orbitals ipk 
must satisfy, we impose that the functional derivative of J- [{V'i}] with respect to 
ipk be zero. To calculate SF/Sifi^, we follow the procedure described in appendix A, 
using the same notation as in eq. (|26p . The variation with respect to each ipt shall 
yield the Hartree-Fock equations for thejrnconjugated Vi- The equations for the 
ipf are obtained either by differentiating T [{V'i}] w ith respect to each ip^ or, if the 
Ay matrix is Hermitian (which will turn out to be the case), by simply taking the 
complex conjugate of both sides of the final equations in (J53~|) , 

Now, 
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lim 

e^O 



(5ip k \ h \ip k ) + ^2 ( WfcV'il - l^ktpj) ~ {Hk^j\ ~ \^j^k)j ~ ^2 Xkj(S^k\^j 



hM*) + Y.(^[ dx ' ~ ^ ( x ) 



r — r 



dx 



r — r 



5tpl(x) dx , 



(49) 



where we have used the more compact notation J dx instead of J dr. 

Then, like in the previous section, by simple inspection of the right-hand side, 
we see that the functional derivative is the part enclosed by square brackets (see 
eq. {AT])): 



i>k{x) ~ ^2^kj1pj{3 



(50) 



where the Coulomb and exchange operators are respectively defined by their 
action on an arbitrary function ip(x) as follows^: 



Jj[^]ip(x) :-- 



|^(x')| 2 



r — r 



j7- dx' J (p(x) 



KM<P(?) ■= ( / 3 1 n dx> UPj(x) . 



\r — r 



(51a) 
(51b) 



Therefore, if we define the GHF Fock operator as 



(52) 



we arrive to a first version of the Hartree-Fock equations by asking that the 
functional derivative in eq. (I50j) be zero: 



F GHF [ip] tPiix) = ]T ^(x) , i = 1, . . . ,N . 



(53) 



Now, in order to obtain a simpler version of them, we shall take profit from the 
fact that the whole problem is invariant under a unitary transformation among the 
one-electron orbitals. 



1 Like in the Hartree case in the previous section, the word operator is a common notational abuse if they 
act upon the very ipi on which they depend. This is again made explicit in the notation. 
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If we repeat the calculation in eq. ({490 but varying this time, instead of 
and use the following relation: 



J ip* (x)hipj(x)dx = J [hip*(x)]ipj(x)dx , (54) 
we arrive to the GHF equations for the conjugated spin-orbitals: 



F GHF [VM(*) = 5>^(x) • i = h---,N . (55) 

3 

Then, we may subtract the complex conjugate of eq. ({550 from eq. ({530 yielding 



^ (Xij - X*i) if>j(x)=0, k = l,...,N. (56) 
j 

Therefore, since the set of the ipj is orthogonal and hence linearly independent, 
we have that the N x N matrix A := (Ay) of Lagrange multipliers is Hermitian: 



Ay = A£, k,j = l,...,N. (57) 

This actually means that we have a set of three equations that the stationary 
spin-orbitals satisfy, but only two of them are independent. These equations are 
the GHF equations for the ipi and ijjf, in (f53j) and (f55|) . respectively, and ({57]) . Any 
pair of them could be in principle be chosen as the basic equations, however, in 
common practice the first and the last one of them are typically picked. 

In any case, due to ({570 . a unitary matrix U exists that diagonalizes A; in the 
sense that e := U~ 1 AU = U + MJ is a diagonal matrix, i.e., ey = <5y£j. Using this 
unitary matrix U, we can transform the set of orbitals {ipi} into a new one {vp^}: 



^ k {x)=Y,U kj ^{x) . (58) 

j 

This transformation is physically legitimate since it only changes the A^-electron 
wavefunction ^ in an unmeasurable phase e 1 ^ . To see this, let us denote by Sy 
the (ij)-element of the matrix inside the Slater determinant in eq. ({330 . i.e., Sy := 
tjjj(xi). Then, after using the expression above, the (r/)-element of the new matrix 
S' can be related to the old ones via S k i = Ylj Ukj^lji m such a way that S = S'U T 
and the desired result follows: 



.. n detS det (S'U T ) detS'detU T i( * T / r/m 
*({&}) = -7= = = T^tt = e MM) 



(59) 



Now, we insert eq. ([580 into the first version of the Hartree-Fock equations in ({530 : 
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F GHF [^'] I U ^j( x ) I = E WiMx) > i = l,...,N. (60) 

\ 3 J 3,k 

Next, we multiply by U^ 1 each one of the N expressions and sum in i: 

( \ 



F GHF [Ui>'} 



= E2i^^5^( x ) 

V ' $lj J ,h £lk = &lk e l 

F ghf [UtP>] $(x) = ert[{x) , I = 1, . . . ,N . (61) 



Although this new version of the Hartree-Fock equations can be readily seen as a 
pseudo-eigenvalue problem and solved by the customary iterative methods, we can 
go a step further and show that, like the iV-particle wavefunction ^ (see eq. ([59]) ). 
the Fock operator F F [ip], as a function of the one-electron orbitals, is invariant 
under a unitary transformation such as the one in eq. (158p , In fact, this is true for 
each one of the sums of Coulomb and exchange operators in eq. (152p separately: 



/ 



\r — r 

i,' r„>\\2 



E y j ] -yz^[ dx j = E j j w\ . v^(x) , (62) 

where, in the step before the last, we have summed on j and I, using that 

T,j U kl U jl = 6 kl- 
Performing very similar calculations, one can show that 



Kj [Utf] <f(x)=J2 K [if/] <p{x) , Vp(x) , (63) 



and therefore, that F GHF [Uip'} = F GRF [ip'}. In such a way that any unitary 
transformation on a set of orbitals that constitute a solution of the Hartree-Fock 
equations in (153D yields a different set that is also a solution of the same equations. 
For computational and conceptual reasons (see, for example, Koopmans' Theorem 
below), it turns out to be convenient to use this freedom and choose the matrix U 
in such a way that the Lagrange multipliers matrix is diagonalized (see eq. (|56|) and 
the paragraph below it). The particular set of one-electron orbitals {tp^} obtained 
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with this U are called canonical orbitals and their use is so prevalent that we 
will circumscribe the forecoming discussion to them and drop the prime from the 
notation. 

Using the canonical orbitals, the Hartree-Fock equations can be written as 



F GHF [^}^(x) =E i ip i {x) , i = l,...,N. (64) 

Many of the remarks related to these equations are similar to those made about 
the Hartree ones in (|30f) . although there exist important differences due to the 
inclusion of the indistinguishability of the electrons in the variational ansatz. This 
is clearly illustrated if we calculate the joint probability density, associated to a 
wavefunction like the one in eq. (|32h . of the coordinates with label 1 taking the 
value x%, the coordinates with label 2 taking the value X2, and so on: 

p GUF ( Xl ,...,x N ) = \^f(x 1 ,...,x N )\ 2 = 

_1 £ (-1) TW+T ^; (1) (si)-^ (65) 
p,p' &s N 

If we compare this expression with eq. (|3ip . we see that the antisymmetry of 
^ has completely spoiled the statistical independence among the one-electron co- 
ordinates. However, there is a weaker quasi-independence that may be recovered: 
If, using the same reasoning about permutations that took us to the one-electron 
part hi \^) °f the energy functional in page 1181 we calculate the marginal 

probability density of the i-th coordinates taking the value Xj, we find 



P p HF (^):= J mdx k \p GHF (x 1 ,..., XN ) = ^Y,M x ^\ 2 - ( 66 ) 

\ky^i J j 



Now, since the coordinates indices are just immaterial labels, the actual proba- 
bility density of finding any electron with coordinates x is given by 



P G ™(x):=Y,P? nF (x) = EH x )\ > ( 67 ) 



which can be interpreted as a charge density (except for the sign), as, in atomic 
units, the charge of the electron is e = —1. The picture being consistent with the 
fact that p GHF (x) is normalized to the number of electrons N: 



J p GHF (x)dx = N . (68) 

Additionally, if we perform the same type of calculations that allowed to calculate 
the two-electron part of the energy functional in page PT9l we have that the two- 
body probability density of the i-th coordinates taking the value Xi and of the j-th 
coordinates taking the value Xj reads 
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/ n dxk ) p ghf (xx,..., XN ) = 

k (xi)\ 2 \ipi(xj)\ 2 - ^i>k( x i)i , i( x j)i>i( x i)i>k(xj) } , (69) 



and, if we reason in the same way as in the case of /3p HF (xj), in order to get to 
the probability density of finding any electron with coordinates x at the same time 
that any other electron has coordinates x', we must multiply the function above by 
N(N — 1) /2, which is the number of immaterial (i, j')-labelings, taking into account 
that the distinction between x and x' is also irrelevant: 



\ (El^^)| 2 Eh(^r-E^(^)^(^)^)^(^)) ■ (70) 

\ k I k,l 

Finally, taking eq. (|67p to this one, we have 



p GHF (x,x') = i [p GUF (x)p GnF (x')-^rk( x )^( x ')M x )M x ')) , (71) 
V k,l J 

where the first term corresponds to independent electrons and the second one, 
called the interference term, could be interpreted as an exchange correction. 

Although, in general, this is the furthest one may go, when additional constraints 
are imposed on the spin part of the one-electron wavefunctions (see the discussion 
about Restricted Hartree-Fock in the following pages, for example), the exchange 
correction in eq. (|7ip above vanishes for electrons of opposite spin, i.e., electrons 
of opposite spin turn out to be pairwise independent. However, whereas it is true 
that more correlation could be added to the Hartree-Fock results by going to higher 
levels of the theory and, in this sense, Hartree-Fock could be considered the first 
step in the 'correlation ladder', one should not regard it as an 'uncorrelated' ap- 
proximation, since, even in the simplest case of RHF (see below), Hartree-Fock 
electrons (of the same spin) are statistically correlated. All of this has its roots in 
the Pauli principle, which states that no pair of electrons can share all the quantum 
numbers. 

Let us now point out that, like in the Hartree case, the left-hand side of the 
Hartree-Fock equations in (|64h is a complicated, non-linear function of the or- 
bitals {ipi} and the notation chosen is intended only to emphasize the nature of 
the iterative protocol that is typically used to solve the problem. However, note 
that, while the Hartree operator H.k[<p] depended on the index of the orbital 4>k on 
which it acted, the Fock operator in eq. (I64h is the same for all the spin-orbitals 
ipi. This is due to the inclusion of the i = j terms in the sum of the Coulomb 
and exchange two-electron integrals in eq. (I47j) and it allows to perform the iter- 
ative procedure solving only one eigenvalue problem at each step, instead of N of 
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them like in the Hartree case (see however the UHF and ROHF versions of the 
Hartree-Fock problem in what follows). 

The one-particle appearance of eqs. (|64h is again strong and, whereas the 'eigen- 
values' £j are not the energies of the individual electrons, they are called orbital 
energies due to the physical meaning they receive via the well-known Koopmans' 
Theorem [60]. 

To get to this result, let us multiply eq. (|64"1) from the left by ij}i{x), for a given 
i, and then integrate over x. Using the definition of the Fock operator in eq. (I52j) 
together with the Coulomb and exchange ones in eqs. (|5ip . we obtain 

(rPi\ F GHF |^> =hi + J2 (Jij -Ki^=Ei, i = l,...,N, (72) 

j 

where we have used the same notation as in eq. (I47D and the fact that the one- 
electron orbitals are normalized. 

If we next sum on i and compare the result with the expression in eq. (I47p . we 
found that the relation of the eigenvalues E{ with the actual Hartree-Fock energy 
is given by 

£ GHF = J>4£(j,-i^). (73) 

i i,j 

Finally, if we assume that upon 'removal of an electron from the k-th orbital' the 
rest of the orbitals will remain unmodified, we can calculate the ionization energy 
using the expression in (j4"7j) together with the equations above: 

Yl hi - Yl hi + 2 Y ~ K v) 

(Jkj - tffci) = ~£k , (74) 

3 

and this is Koopmans' Theorem, namely, that the k-th ionization energy in the 
frozen- orbitals approximation is £}■■ 

Moving now to the issue about the solution of the Hartree-Fock equations in (|64p , 
we must remark that the necessity of using the relatively unreliable iterative ap- 
proach to tackle them stems again from their complicated mathematical form. Like 
in the Hartree case, we have managed to largely reduce the dimension of the space 
on which the basic equations are defined: from 3N in the electronic Schrodinger 
equation in (I9aj) to 3 in the Hartree-Fock ones. However, to have this, we have payed 
the price of dramatically increasing their complexity [55], since, while the electronic 
Schrodinger equation was one linear differential equation, the Hartree-Fock ones 
in (|64p are N coupled non-linear integro-differential equations, thus precluding any 
analytical approach to their solution. 

A typical iterative procedural begins by proposing a starting guess for the set 



1 The process described in this paragraph must be taken only as an outline of the one that is performed 
in practice. It is impossible to deal in a computer with a general function as it is (a non-countable infinite 
set of numbers), and the problem must be discretized in some way. The truncation of the one-electron 
Hilbert space using a finite basis set, described in sees. 151 and l9l is the most common way of doing this. 
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of N spin-orbitals {ipf}. With them, the Fock operator _F GHF [^°] in the left-hand 
side of eq. ([M]) is constructed and the set of ./V equations is solved as one simple 
eigenvalue problem. Then, the {ipj} that correspond to the N lowest eigenvalues 
e\ are selected (see the discussion of the aufbau principle below) and a new Fock 
operator _F GHF [?/> 1 ] is constructed with them. The process is iterated until (hope- 
fully) the n-th set of solutions {V'f'} differs from the (n — l)-th one {V^ -1 } l ess 
than a reasonably small amount (defining the distance among solutions in some 
suitable way typically combined with a convergence criterium related to the asso- 
ciated energy change). When this occurs, the procedure is said to have converged 
and the solution orbitals are called self- consistent; also, a calculation of this kind 
is commonly termed self-consistent field (SCF). 

Again, like in the Hartree case, many issues exist that raise doubts about the 
possible success of such an approach. The most important ones are related to 
the fact that a proper definition of the Hartree-Fock problem should be: find the 
global minimum of the energy functional {^/\H\^) under the constraint that the 
wavefunction ^ be a Slater determinant of one- electron spin-orbitals, and not: solve 
the Hartree-Fock equations (|64p . The solutions of the latter are all the stationary 
points of the constrained energy functional, while we are interested only in the 
particular one that is the global minimum. Even ruling out the possibility that a 
found solution may be a maximum or a saddle point (which can be done [50,55]), 
one can never be sure that it is the global minimum and not a local one. 

There exists, however, one way, related to the Hartree-Fock version of the the- 
orem by Simon and Lieb [57, 58] mentioned in the previous section, of hopefully 
biasing a particular found solution of eqs. (|64p to be the global minimum that we 
are looking for. They showed, first, that for neutral or positively charged molecules 
{Z > N), the Hartree-Fock global minimization problem has a solution (its unique- 
ness is not established yet [55]) and, second, that the minimizing orbitals {ipi} 
correspond to the N lowest eigenvalues of the Fock operator F GHF [^] that is self- 
consistently constructed with them. Therefore, although the reverse is not true in 
general [55] (i.e., from the fact that a particular set of orbitals are the eigenstates 
corresponding to the lowest eigenvalues of the associated Fock operator, does not 
necessarily follow that they are the optimal ones), the information contained in 
Simon and Lieb's result is typically invoked to build each successive state in the 
iterative procedure described above by keeping only the N orbitals that correspond 
to the N lowest eigenvalues e\. Indeed, by doing that, one is effectively constraining 
the solutions to have a property that the true solution does have, so that, in the 
worst case, the space in which one is searching is of the same size as the original 
one, and, in the best case (even playing with the possibility that the reverse of 
Simon and Lieb's theorem be true, though not proved), the space of solutions is 
reduced to the correct global minimum alone. This wishful-thinking way of pro- 
ceeding is termed the aufbau principle [55], and, together with a clever choice of 
the starting-guess set of orbitals [61] (typically extracted from a slightly less accu- 
rate theory, so that one may expect that it could be 'in the basin of attraction' of 
the true Hartree-Fock minimum), constitute one of the many heuristic strategies 
that make possible that the aforementioned drawbacks (and also those related to 
the convergence of iterative procedures) be circumvented in real cases, so that, in 
practice, most of SCF calculations performed in the field of quantum chemistry do 
converge to the true solution of eqs. (|64p in spite of the theoretical notes of caution. 



Moreover, the iterative procedure is normally performed using not the spin-orbitals but the spatial ones. 
In this sense, the restricted versions of the Hartree-Fock problem, discussed below, are closer to the actual 
implementation of the theory in computer applications. 
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Now, to close this GHF part, let us discuss some points regarding the imposi- 
tion of constraints as a justification for subsequently introducing three commonly 
used forms of the Hartree-Fock theory that involve additional restrictions on the 
variational ansatz (apart from those in eqs. (|32p and (|34p ). 

In principle, the target systems in which we are interested in our group and to 
which the theory developed in this work is meant to be applied are rather com- 
plex (short peptides, small ligands, etc.). They have many degrees of freedom and 
the different interactions that drive their behaviour typically compete with one an- 
other, thus producing complicated, 'frustrated' energy landscapes (see refs. [62-67], 
but note, however, that we do not need to think about macromolecules; a small 
molecule like CO2 already has 22 electrons). This state of affairs renders the a priori 
assessment of the accuracy of any approximation to the exact equations an impos- 
sible task. As researchers calculate more and more properties of molecular species 
using quantum chemistry and the results are compared to higher-level theories or 
to experimental data, much empirical knowledge about 'how good is Theory A 
for calculating Property X' is being gathered. However, if the characterization of 
a completely new molecule that is not closely related to any one that has been 
previously studied is tackled with, say, the Hartree-Fock approximation, it would 
be very unwise not to 'ask for a second opinion'. 

All of this also applies, word by word, to the choice of the constraints on the 
wavefunction in variational approaches like the one discussed in this section: For 
example, it is impossible to know a priori what will be the loss of accuracy due 
to the requirement that the A r -particle wavefunction $ be a Slater determinant 
as in eq. (132)) . However, in the context of the Hartree-Fock approximation, there 
exists a way of proceeding, again, partly based on wishful thinking and partly 
confirmed by actual calculations in particular cases, that is almost unanimously 
used to choose additional constraints which are expected to yield more efficient 
theories. It consists of imposing constraints to the variational wavefunction that 
are properties that the exact solution to the problem does have. In such a way that 
the obvious loss of accuracy due to the reduction of the search space is expected 
to be minimized, while the decrease in computational cost could be considerable. 

This way of thinking is clearly illustrated by the question of whether or not 
one should allow that the one-electron spin-orbitals ipi (and therefore the total 
wavefunction \£) be complex valued. Indeed, due to the fact that the electronic 
Hamiltonian in eq. (|2ip is self-adjoint, the real and imaginary parts of any complex 
eigenfunction solution of the time independent Schrodinger equation in (19aj) are also 
solutions of it [55]. Therefore, the ground-tate, which is the exact solution of the 
problem that we are trying to solve, may be chosen to be real valued. Nevertheless, 
the exact minimum will not be achieved, in general, in the smaller space defined by 
the Hartree-Fock constraints in eqs. (|32p and (|34p . so that there is no a priori reason 
to believe that allowing the Hartree-Fock wavefunction to take complex values 
would not improve the results by finding a lower minimum. In fact, in some cases, 
this happens [61]. Nevertheless, if one constrains the search to real orbitals, the 
computational cost is reduced by a factor two, and, after all, the whole formalism 
discussed in this work profits from the imposition of constraints (starting by the 
consideration of only one Slater determinant), all of which save some computational 
effort at the expense of a reduction in the accuracy. The search for the most efficient 
of these approximations constitutes the main part of the quantum chemistry field. 

Apart from these 'complex vs. real' considerations, there exist three further re- 
strictions that are commonly found in the literature and that affect the spin part of 
the one-electron orbitals tfti . The A^-electron wavefunction ^ of the GHF approxi- 
mation (which is the one discussed up to now) is not an eigenstate of the total-spin 
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1 1/2 1/4 
CoGHF CoUHF *~ CoRHF 

\ l l 

ReGHF ReUHF *~ ReRHF 

1/2 1/4 1/8 

Figure 1. Schematic relation map among six types of Hartree-Fock methods discussed in the text: 
General Hartree-Fock (GHF), Unrestricted Hartree-Fock (UHF) and Restricted Hartree-Fock (RHF), in 

both their complex (Co) and real (Re) versions. The arrows indicate imposition of constraints; 
horizontally, in the spin part of the orbitals, and, vertically, from complex- to real-valued wavefunctions. 
Next to each method, the size of the search space relative to that in CoGHF is shown. 

operator, S 2 , nor of the z-component of it, S z [61]. However, since both of them 
commute with the electronic Hamiltonian in eq. (|2ip , the true ground-state of the 
exact problem can be chosen to be an eigenstate of both operators simultaneously. 
Therefore three additional types of constraints on the spin part of the GHF wave- 
function in (|32p are typically made that force the variational ansatz to satisfy these 
ground-state properties and that should be seen in the light of the above discus- 
sion, i.e., as reducing the search space, thus yielding an intrinsically less accurate 
theory, but also as being good candidates to hope that the computational savings 
will pay for this. 

The first approximation to GHF (in a logical sense) is called Unrestricted Hartree- 
Fock (UHF) and it consists of asking the orbitals ipi to be a product of a part (f>i{r) 
depending on the positions r times a spin eigenstate of the one-electron s z operator, 
i.e., either a(a) or (3(a) (see eq. (|36j) ). This is denoted by ipi(x) := (j>i(r)%(o~), 
where 7$ is either the a or the (3 function. Now, if we call iV a and Np the number 
of spin-orbitals of each type, we have that, differently from the GHF one, the UHF 
iV-particle wavefunction *!' is an eigenstate of the S z operator with eigenvalue 
(\/2)(N a — Np) (in atomic units, see sec. [2]). However, it is not an eigenstate of 
S 2 and, when this deviation results into a poor description of the observables in 
which we are interested, we talk about spin contamination [68]. Although the UHF 
wavefunction can be projected into pure 5 2 -states, the result is multideterminantal 
[61] and will not be considered here since it spoils many of the properties that render 
Hartree-Fock methods a low-cost choice. 

Regarding the computational cost of the UHF approximation, it is certainly lower 
than that of GHF, since the search space is half as large: In the latter case, we 
had to consider 2N (complex or real) functions of IR 3 (the ipf(r) and the (p?(r), 
see eq. (|35p ). while in UHF we only have to deal with N of them: the <f>i(r). 

Now, if we introduce into the expression for the GHF constrained functional 
in (|48p the following relations that hold for the UHF spin- and spatial orbitals^ 



h = (fcl h \(f>i 



(i>ii>j\ - IV^j) 



1 

k<t>j \ - \4>i4>j) , 



(75a) 
(75b) 

(75c) 



1 Of course, the average values at both sides of the expressions are taken over different variables: over x and 
x' on the left-hand side, and over r and r' on the right-hand side. Also, let us remark that, although placing 
functions as arguments of the Kronecker's delta 5 7i7j is a bit unorthodox mathematically, it constitutes 
an intuitive (and common) notation. 
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we may perform a derivation analogous to the one performed for the GHF case, 
and get to a first version of the UHF equations 



(N N \ 

3 3 J 3 

»i(r)=£ A i^K r ) , (76b) 



hUHF r 



for all % = 1, . . . ,N and where the Coulomb and exchange operators dependent 
on the spatial orbitals fa are defined by their action on an arbitrary function (p(r) 
as follows: 



JM <p{r) := (| dr') <p(r) , (77a) 

/ r 6*Ar')<p(r') \ 
Kj[fa <p{r) :=U 3 lr _ r \ dr'J fc(r) . (77b) 

Now, one must note that, differently from the GHF case, due to the fact that 
the exchange interaction only takes place between orbitals 'of the same spin', the 
UHF Fock operator .F i UHF [<^] depends on the index i. This precludes the solution 
of UHF as a single pseudoeigenvalue problem (c.f. eq. (|64p ) and makes necessary 
some further considerations in order to arrive to a more directly applicable form 
of the expressions: 

First, although the equations for fa and fa\ in (|76p can be combined in the same 
way as in the GHF case to yield the Hermiticity conditions Xij = A*j, there are 
fewer Lagrange multipliers in the UHF scheme than in the previous derivation. 
To see this, one must notice that the orthogonality constraints must be imposed 
on the spin- orbitals, not on the spatial orbitals (see eq. (|34p ) . Therefore, since two 
UHF spin-orbitals ipi = fact and ipj = faj3 are orthogonal no matter the value 
of (fa\(f)j) due to the different spin parts, the corresponding Lagrange multiplier 
Xij needs not to be included in the constrained functional from which the UHF 
equations come. This may be incorporated into the formalism by simply using that 
the matrix A := (Xij) in eqs. (|76|) presents the following block-diagonal form: 



A UHF := I . ) , (78) 




where we have assumed (without loss of generality) that the UHF spin-orbitals 
are ordered in such a way that the a ones occur first, and indicates a block of 
zeros of the appropriate size. (Of course, redundant constraints may be imposed on 
the orbitals by, for example in this case, including matrix terms in A that connect 
the a and (3 spaces. However, in order to know the exact freedom we have in the 
choice of the constraints, it is convenient to use the minimal number of Lagrange 
multipliers. If this approach were not followed, for example, the discussion below 
about the diagonalization of A a and A' 3 would become much less direct.) 

The next step consists in noticing that, despite the dependence of i^ UHF on the 
orbital index i in eqs. (|76p . there are actually only two different Fock operators: 
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one for the a orbitals and one for the (5 ones. Defining the sets of indices A := 
< i < N a } and B := {i\N a + 1 < i < N a + Np = N}, we can write these two 
a and (5 operators: 



Y^KM , (79a) 

jeA 

Yj^M • (7%) 

jeB 

With them, and using the particular structure of the matrix A UHF in (j78[) . the 
original UHF equations in (|76[) are split into two disjoint sets of expressions that are 
only coupled through the Fock operators on the left hand sides, plus the Hermiticity 
condition: 

Fa HF Wi(r) = £ KM r ) > if * G A . ( 80a ) 

jeA 

F^ F m(r) = £ Ag^(r) , if i G B , (80b) 

Aij = A*, . (80c) 

The last step needed to arrive to the final form of the UHF equations (found by 
Pople and Nesbet [69] and named after them) is the diagonalization of both the A a 
and A! 3 Hermitian matrices above. In order to achieve this, the orbitals <j>i must be 
transformed similarly to the GHF case. However, this is trickier than it was then, 
since not only the iV-electron wavefunction ^ and the Fock operators must remain 
invariant under the sought transformation, but also the UHF constraints must be 
kept. 

As we saw before, any unitary transformation U in the set of spin-orbitals rpi like 
the one in eq. (|58p is physically legitimate, since it changes the Slater determinant 
by only an unmeasurable phase and leave the Fock operators invariant. Then, if 
we write each spin-orbital as in ([35]): 

Mx) = tf(r)a(a) + ^(r)P(a) , (81) 
we can make use of ([58]) . to obtain 

(r) a(a) + ^f(r) (3(a) = £ U l3 [pf{r) a{a) + <pf(r) P(a)] . (82) 

j 

By setting a = —1/2 and a = 1/2 in this expression, we see that any transfor- 
mation U in the spin-orbitals ipi induces exactly the same transformation in their 
spatial components tpf and ip? , 

^(r) = X> i¥ >7(r), 7 = a,0. (83) 

3 



nUHFr 



pUHF r 



N 

3=1 
N 

3=1 
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Now, if we order the sets of spin and spatial orbitals: ij) T := (ipi, . . . ,i/jn) and 
(<p 7 ) T := (ipj, . . . , ipjf), with 7 = a, (3, we can express the UHF constraints by 
saying that the <£> 7 must have the form 

(cp a ) T = (</>!,. ..,ct> Na ,0,...,Q) , (84a) 
(^) T = (0,..., 0,^+1,...,^+^) • (84b) 

Then, since the fact that the a orbitals appear first constitutes no loss of gener- 
ality, we must ask the transformed (^. 7 , with 7 = a, /?, to have also the structure 
in (|84p if we want to remain inside the UHF scheme. As a consequence, and due to 
the linear independence among the orbitals, not every unitary matrix U is allowed, 
but only those of the form 



( U a \ 
using the same notation as in eq. ([751) . 

This can be easily proved by focusing on a particular value for i and 7 in eq. (1830 , 
say, i S A and 7 = /3. Due to the UHF constraints in (|84p . we know that the left- 
hand side of such an expression is zero and that only spatial orbitals with j £ B 
appear in the sum on the right-hand side, yielding the relation = Ylj^B ^ij4 >l j{ r )- 
But the (f>j(r), with j £ B, form an orthonormal, and therefore linearly independent 
set, so that the only possibility that such a relation can hold is that all coefficients 
Uij be zero. By repeating this for alH G A and, then, for 7 = 0, the result follows. 

Fortunately, this limited freedom in the choice of U is still enough to indepen- 
dently diagonalize A a and A 13 in eq. (|80h (which are both Hermitian) by suitably 
choosing the unitary submatrices U a and XJ^ respectively. 

This takes us to the final, diagonal form of the UHF equations, the Pople-Nesbet 
equations [69]: 

F Q UHF M<Mr) = efUr) , i£ i € A , (86a) 
F^Wfair) = efyiir) , if % G B . (86b) 

Although these equations are coupled through the Coulomb term in the Fock 
operators on the left-hand side, at each step of the iterative SCF procedure, they 
can be solved as two independent eigenvalue problems. This has allowed to imple- 
ment them in most quantum chemical packages and it has made UHF calculations 
now routine. 

To close the UHF discussion, we shall now study the statistical properties of 
the probability densities appearing in this model. If we introduce the special form 
of the UHF orbitals in (I84h into the general expression in (j70p . we can calculate 
the two-body probability density of finding any electron with coordinates x at the 
same time that any other electron has coordinates x': 
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p UHF (*,*') = ~ ( Y.\Mr)\ 2 \Mr')\ 2 7k(*)ji(<T') 
V k,l 



~ E ^(r) 0?(r') Mr) Mr') M°) ll{°') 7*00 111?) • (87) 
k,l ) 

If we compute this probability density for 'electrons of the same spin', i.e., for 



a = a', we 



obtair0 



p^ F (r,r';a = a') = 
\ E (\Mr)\ 2 \Mr')\ 2 -Mr)<Pl(r')Mr)Mr') 



2 

k.ieia 



- I p VRF (r,a)p VUF (r\a)-Y,K«h«Mr)<f>l(r')Mr)Mr') 

k,l 



where the following expression for the one-electron charge density has been used: 



p lJHF (r,a)=^2\Mr)\lk(a) . (89) 

k 

At this point, note that eq. (|88p contains, like in the GHF case, the exchange 
correction to the first (independent electrons) term. Nevertheless, as we advanced, 
if we calculate the two-body p UHF for 'electrons of opposite spin', i.e., for a ^ a', 
we have that 



^(r.rWoO = ^ UHF (r,a)p UHF (rV) , (90) 

i.e., that UHF electrons of opposite spin are statistically pairwise independent. 

There is another common approximation to GHF that is more restrictive than 
UHF and is accordingly called Restricted Hartree-Fock (RHF). Apart from asking 
the orbitals ipi to be a product of a spatial part times a spin eigenstate of the one- 
electron s z operator (like in the UHF case), in RHF, the number of 'spin-up' and 
'spin-down' orbitals is the same, N a = Nq (note that this means that RHF may 
only be used with molecules containing an even number of electrons), and each 
spatial wavefunction occurs twice: once multiplied by a(a) and the other time by 
f3(a). This is typically referred to as a closed-shell situation and we shall denote it 
by writing ipi(x) := M r ) ot{p) if i < N/2, and ipi(x) := 4>i-N/2(r) P{p) if % > N/2; 
in such a way that there are N/2 different spatial orbitals denoted by (f>i(r), with, 
1 = 1,... ,N/2. Using the same notation as in (fMj) . the RHF constraints on the 
spin-orbitals are 



1 Placing a function and a coordinate as arguments of the Kronecker's delta S-y k<T is even more unorthodox 
mathematically than placing two functions (in fact, &- lk< y is exactly the same as 7/c(c)), however, the 
intuitive character of the notation compensates again for this. 
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(v a ) T = Oi,...,^v /2 ,0,...,0) 
(^) T = (O,...,O,0 1 ,...,^/ 2 ) 



(91a) 
(91b) 



Due to these additional restrictions, we have that, differently from the GHF 
and UHF ones, the RHF iV-particle wavefunction ^ is an eigenstate of both the 
S 2 and the S z operators, with zero eigenvalue in both cases [19,61], just like the 
ground-state of the exact problem. I.e., there is no spin- contamination in RHF. 

Regarding the computational cost of the RHF approximation, it is even lower 
than that of UHF, since the size of the search space has been reduced to one quarter 
that of GHF: In the latter case, we had to consider 2N (complex or real) functions 
of R 3 (the (ff(r) and the Lp^(r), see eq. (|35p ). while in RHF we only have to deal 
with N/2 of them: the </>/(r) (see fig. [[]). 

Now, using again the relations in (|75p . we can derive a first version of the RHF 
equations: 



F 



RHF r 



N/2 

h + J2[2Jj[ 

J 



K 



N/2 



(92) 



for all I = 1, . . . , N/2, where we have used that the RHF Fock operator .F RHF [</>], 
differently from the UHF case, does not depend on the index / of the orbital 
on which it operates, and following the same steps as before, the minimal La- 
grange multipliers matrix needed to enforce the orthogonality constraints among 
the spinorbitals in the RHF case is 



A 



RHF 



AW 2 >/2 
AW 2 )/2 



(93) 



where, this time, A^/ 2 -* := (A/j) is an arbitrary N/2 x N/2 Hermitian matrix, 
and the 1/2 has been included in order to get to the classical RHF equations in (192p 
without irrelevant numerical factors. 

Using the same arguments as for UHF, it is clear that, in order to 'remain inside 
RHF' upon an unitary transformation of the spin-orbitals ipi, not every unitary 
matrix U is allowed, but only those of the form 



U 



RHF 



/U(N/2) o 

[ o uwv 



(94) 



Finally, by suitable choosing the A^/2 x A^/2 unitary block U^ N ^ 2 \ the Hermitian 
matrix can be diagonalized and the final, diagonal form of the RHF equations 

can be written: 



F 



RHF r 



\<t>i(r) 



N/2 



2JM-KM 



4>i{r) = ej(f>i(r) 



(95) 
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with I = l,...,N/2. 

As we can see, this version of the Hartree-Fock theory can be numerically solved 
as a single pseudoeigenvalue problem. This, together with the aforementioned small 
size of the RHF space, has made the RHF approximation (in its real- valued version) 
the first one cast into a computationally manageable form [79, 80] and the most 
used one in recent literature [14, 70-78] (for molecules with an even number of 
electrons) . 

Now, in order to investigate the statistical features of RHF, if we introduce 
the special form of the orbitals in (|9ip into the general expression in (170p . we 
can calculate the RHF two-body probability density of finding any electron with 
coordinates x at the same time that any other electron has coordinates x': 



(N/2 N/2 
]T \<t>K{r)\ 2 \<j>L{r')\ 2 - <W £ <f>* K (r) cj>* L {r') Mr) M(r') 
K,L K,L 

ll N/2 \ 

- ip(r,a)p(r',a , )-5 (TCT/ Y. L < l ) K( r )Mr')Mr)M(r') \ ■ (96) 

where the following expression for the RHF one-electron charge density has been 
used: 



N/2 

p RKF (r,a) = Y,\M(r)\ 2 • (97) 

K 

We notice that the situation is the same as in the UHF case: For RHF electrons 
with equal spin, there exists an exchange term in p^ F (x, x') that corrects the 
'independent' part, whereas RHF electrons of opposite spin are statistically pairwise 
independent. 

Finally, if we follow the same steps as for GHF, in page l26l we can relate the 
RHF energy to the eigenvalues ej and the two-electron spatial integrals: 

N/2 N/2 I \ 

E = 2^2 £I -J2 h(Mj\ - \hM ~ (Mj\ - \<PjM\ • (98) 

To close this section, we shall discuss a fourth flavour of Hartree-Fock which is 
called Restricted Open-shell Hartree-Fock (ROHF). Compared to the rest of vari- 
ants, ROHF is the most difficult to derive theoretically and it shall be described 
here only in an introductory manner. The ROHF wavefunction is somewhat be- 
tween the RHF and the UHF ones, and both of them can be obtained as particular 
cases of the ROHF scheme (the RHF one provided that the molecule has an even 
number of electrons). In the (monodeterminantal) ROHF case, the one-electron 
spin orbitals tpi are constrained to be of two different types: 2Nd of them are dou- 
bly occupied, like in the RHF case, in such a way that they are formed by only No 
spatial orbitals (f> a , each one of them appearing once multiplied by a(o~) and once 
by /3(cr). The associated 2Nd spin-orbitals are said to belong to the closed shell 
part of the wavefunction. The remaining N$ := N — 2Np ones are singly occupied, 
like in the UHF case, and are said to belong to the open shell. Among them, N a 
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are multiplied by an a(a) spin part, and Np by (3(cr). 

If we number the whole set of ROHF spatial orbitals <p a , with a = 1, . . . , Njj + 
N a + Np, in such a way that the doubly occupied ones occur first, with a 6 D : = 
{a\l < a < Nd}, then the alpha ones, with a £ A := {a|A?£) + 1 < a < iVo + iV a }, 
and finally the beta ones, with a <E B := {a\N D + N a + 1 < a < N D + N a + A^}, 
we can express the ROHF constraints on the spin-orbitals using the same notation 
as in ([84]) and flU]): 

(v a ) T = ( 0i , ■ ■ ■ , 4>n d , 4>n d +i , • • • , (f>N D +N„ ,0, ■ • • , 0) , (99a) 

AT D AT„ N D +Np 

(^) T = (0, • • • , 0, 01 , • • • , 0jVp , <f)N D +N a +l , • • • , (j)N D +N a +Nff ) , (99b) 

where the particular ordering has been chosen in order to facilitate the forecoming 
calculations. 

The general monodeterminantal ROHF wavefunction considered here and con- 
structed using the above constraints has the same spin properties as the UHF one, 
i.e., it is an eigenstate of the the S z operator with eigenvalue (l/2)(N a — Np), but 
it is not an eigenstate of S 2 . However, in the particular (and common) case in 
which all the N$ open shell orbitals are constrained to present parallel spin parts 
(either all a or all (3), the ROHF wavefunction becomes an eigenstate of the S 2 
operator too, with eigenvalue ^r(%- + 1), thus avoiding the problem of spin con- 
tamination [81]. In order to construct wavefunctions with the same S 2 but lower 
S z , several ROHF Slater determinants must be linearly combined. The subtleties 
arising from such a procedure are beyond the scope of this review; the interested 
reader may want to check references [81-84], which discuss this topic. 

Regarding the size of the variational space in ROHF, it is somewhere between 
UHF and RHF, depending on the 2N D /(N a + Np) ratio. 

Now, if the ROHF constraints in (|99p are imposed on the GHF energy in eq. (|47p . 
the ROHF analogue in terms of the spatial orbitals 4> a can be calculated: 



a 




(100) 



where the f a are a sort of 'occupation numbers' that take the value f a = 2 when 
a G D (i.e., when it corresponds to a closed shell orbital) and f a = 1 otherwise. 
The matrix g := (g a b) is defined as 





(2 D 


1 \ 




v 1 


l a 

o if) 



being 2 D a N D x N D box of 2's, l a and I 13 , N a x N a and Np x Np boxes of l's 
respectively. The off-diagonal blocks contain in all elements the number indicated 
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and are of the appropriate size. 

Then, as we did for GHF, in order to derive the ROHF equations, we construct 
the functional for the conditioned stationary values problem, adding to the en- 
ergy in (|100p the Lagrange multipliers terms needed to enforce the orthonormality 
constraints: 



T [{<M] = £ R ° HF [{<M] ~ J] X ab " Sab) • (102) 

a, b 

Now, like in all restricted HF cases, special attention must be payed to the 
structure of the matrix A ROHF := (X a b), since the requirement is that all spin- 
orbitals be orthogonal, not the spatial orbitals. In the ROHF case, this leads to 
explicitly impose the orthonormality conditions (of course) inside the three sets 
of (j) a : the doubly occupied, the alpha and the beta ones; but also between the 
doubly occupied and the alpha ones, and between the doubly occupied and the 
beta ones. The orthormality between the alpha and beta sets, however, needs not 
to be enforced, since the associated spin-orbitals are already orthogonal due to the 
different spin parts. 

These considerations lead to the following form for the minimal Lagrange mul- 
tipliers matrix: 



A 



ROHF 



/ A D A Da A D ^ 

(A Dq )+ A a 
\(A^)+ A^ 



(103) 



where the notation used for the different blocks connecting the doubly occupied, 
alpha and beta shells is self-explanatory, and the fact that A ROHF is Hermitian 
after reaching stationarity has been advanced. 

Next, we impose the condition of zero functional derivative on the functional 
in (I102p . and obtain a first version of the ROHF equations: 

Fa OEF Wa(r) = Yl X abMr) , a = 1, . . . , N D + N a + Np , (104a) 

b 

F™ HF [ct>] K(r) = J] ha<t>t(r) , = 1,...,^+^ + ^, (104b) 

b 



where the ROHF operator F ROHF [(/)] is defined as 



^a ROHF M := fah + ( fafbMfl ~ 9abk b [4>] , (105) 
b ^ ' 

and the Hermiticity property of the Lagrange multipliers matrix A ROHF follows 
from conjugation and subtraction in eqs. (|104p . 

Again, although the Fock operator depends on the index a of the orbital upon 
which it operates, this dependence presents a very particular structure, yielding 
only three different types of operators: 



February 1, 2008 5:24 Molecular Physics 



introSCF 



38 



Pablo Echenique and J. L. Alonso 



tiROHF 
D 



? ROKF 



F, 



ROHF 



[<t>] ■■- 



[cp]:- 



2h + Y J hfbM4>}- hKb[4>. 
b ^ 

h+ Yl (jb[<f>]-K b [<t>}\ 

b£{DuA) ^ ' 

h+ J2 (jb[<t>}-K b [<t>}) 



a£D 



a £ A 



(106a) 
(106b) 
(106c) 



bG(DUB) 



Note that, in the particular case that we had no open-shell orbitals, the operator 
for the doubly occupied ones does not reduce (as it should) to the RHF one in ([92]) . 
This is because we hid a factor 2 in the definition of the Lagrange multipliers in 
the RHF derivation. 

At this point, it would be desirable to continue with the same program that we 
followed in the UHF and RHF cases and diagonalize the matrix A ROHF in order 
to arrive to a system of three pseudoeigenvalue equations, coupled only via the 
Fock operators. Nevertheless, this is not possible in ROHF, as we shall show in the 
following lines, and it is the root of ROHF being the most tricky flavour in the 
Hartree-Fock family. The obstruction to achieve this diagonalization comes from 
the fact, already used in the UHF case, that not every unitary transformation of 
the spin-orbitals is allowed if we want to remain inside the ROHF scheme, i.e., 
if we want that the transformed spin-orbitals satisfy the same ROHF constraints 
in (I99p that the untransformed ones did. 

To begin with, if we recall that the doubly occupied spatial orbitals must be 
orthogonal to both the alpha and the beta ones, we can use the same reasoning 
as in page [32] to show that the alpha-beta connecting parts of the allowed unitary 
matrix in this case must be zero. Hence, we can write 



U 



ROHF 



V 



jjD jjDa 
jjiDol jja 


\ 





u ,D u d p 


U'D/3 uP J 



(107) 



where the size of each block may be easily found from (|99p and the notation is 
again self-explanatory. 

Now, in order to obtain further restrictions to the form of {7 ROHF ; we write the 
transformation of the two spin-orbitals in the closed shell that correspond to the 
same spatial orbital cp a , with a £ A. To this end, we use (|107p . (|99l) . and the 
appropriate lines in the first and third lines of blocks in (I107p : 



Mr) = £ D2#(r) + £ Ufrtiir) = £ E/£#(r) + £ U^^r) . (108) 

beD beA beD beB 

Using the orthogonality relations inside and among the three sets of spatial 
orbitals, we multiply the above expression by any (ft c (r), with c E D and integrate 
on r. From this, the equality of U D and U' D follows, and the corresponding sums 
in (|108p subtract to zero, yielding 
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30 



beA 



uS,'<K(r) 



beB 



06 (r) 



(109) 



which equates one vector in the linear space spanned by the alpha orbitals to 
another in the linear space spanned by the beta ones. 

However, only the zero vector can belong to both spaces if we want the ROHF 
assumptions regarding the spin-orbitals to hold. To see this, we can use a reductio 
ad absurdum type of argument: Assume that both sides of fllQ9f) are different from 
zero. Then, we may perform a unitary transformation changing only the alpha and 
beta spaces, and with no elements connecting the two. This is allowed, since it 
does not change neither the TV-electron wavefunction, nor the Fock operators, nor 
the ROHF constraints. Now, if we select the partial unitary transformations in the 
alpha and beta sets so that they turn the vectors at both sides of (|109p into single 
elements in the bases of their respective spaces, we have that a single alpha orbital 
equals a beta one. Although this can happen in particular cases, we cannot ask it 
or we would be changing the fundamental assumptions made in (|99p . Therefore, 
both sides of (|109p must be zero, and, since the alpha and beta sets are linearly 
independent, all coefficients must be zero too. 

The proof that JJ fDa and U' D @ are also zero is performed using similar arguments, 
and the final form of [/ ROHF satisfying all the restrictions reads 



U 



ROHF 



/U D 
U a 











U D 
J 



(110) 



Finally, if we write the associated matrix using the a indices, i.e., operating on 
the set of spatial orbitals with the doubly occupied ones unrepeated: 



U 



ROHF 




(111) 



then, the transformed A' ROHF is related to the original one in (|103|) through 
simple matrix multiplication: A' = U + AU (dropping the ROHF superindices). It 
is clear that such a restricted f7~ ROHF does not operate on the off-diagonal blocks 
of A ROHF in (|103p and, therefore, the sought diagonalization is not possible. 

Using the fact that, however, the A D , A a and A@ do allow to be diagonalized, 
we can write the final, simplest possible form of the ROHF equations (forgetting 
their complex conjugate counterparts): 
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F^WMr) = e°Mr) + £ A£V 6 (r) + £ \^%(r) , aeD, (112a) 

beA beB 

F™ UF l<P}Mr) = e a a Mr) + £(A£Tfc(r) , as A, (112b) 

beD 

^ OHF [^ a (r) = ^ a (r) + ^(A?)> 6 (r), a <E B , (112c) 

where the superindices in the matrix elements are only written for visual conve- 
nience when comparing with (|103p . 

In this final form, it is evident that the off-diagonal elements of the Lagrange 
multipliers matrix, i.e., those related to the orthogonality constraints between the 
closed and open shells, introduce a coupling in the right-hand side of the ROHF 
equations that completely spoils the possibility of casting them into pseudoeigen- 
value ones. In the previous UHF and RHF versions, all orthogonality constraints 
were handled by simply choosing a special basis in the space spanned by the spatial 
orbitals in which the Lagrange multipliers matrix was diagonal. However, in the 
ROHF scheme, there is no such basis and we must deal with the problem in a 
different way, resulting into higher computational and theoretical difficulty. 

Since the pioneering work by Roothaan [85] , the solution of the ROHF problem 
has been attempted by distinct means, ranging from directly tackling the ROHF 
equations in (|112p explicitly forcing the orthogonality constraints [86,87], to the 
construction of a so-called unified coupling operator [85,89,90], which allows to 
turn the ROHF scheme into a single pseudoeigenvalue problem at the price of 
introducing certain ambiguities in the one-electron orbital energies [82,88]. The 
details and subtleties involved in these issues are beyond the scope of a review of 
the fundamental topics such as this one. The interested reader may want to check 
the more specialized accounts in [81-84]. 



8 The Roothaan-Hall equations 

The Hartree-Fock equations in the RHF form in expression ([95]) are a set of 
N/2 coupled integro-differential equations. As such, they can be tackled by finite- 
differences methods and solved on a discrete grid; this is known as numerical 
Hartree-Fock [91], and, given the present power of computers, it is only applicable 
to very small molecules. 

In order to deal with larger systems, such as biological macromolecules, inde- 
pendently proposed by Roothaan [79] and Hall [80] in 1951, a different kind of 
discretization must be performed, not in R 3 but in the Hilbert space TL of the 
one-electron orbitals. Hence, although the actual dimension of Ti is infinite, we 
shall approximate any function in it by a finite linear combination of M differ- 
ent functions Xq0- In particular, the one-electron orbitals that make up the RHF 
wavefunction, shall be approximated by 



1 In all this section and the next one, the indices belonging to the first letters of the alphabet, a, 6, c, d, 
etc., run from 1 to M (the number of functions in the finite basis set); whereas those named with capital 
letters from / towards the end of the alphabet, /, J, K, L, etc., run from 1 to N/2 (the number of spatial 
wavefunctions <pj, also termed the number of occupied orbitals). 
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caixa(r), I=l,...,N/2, M>-. (113) 

a 

In both cases, numerical Hartree-Fock and discretization of the function space, 
the correct result can be only be reached asymptotically; when the grid is very fine, 
for the former, and when M — > oo, for the latter. This exact result, which, in the 
case of small systems, can be calculated up to several significant digits, is known 
as the Hartree-Fock limit [92]. 

In practical cases, however, M is finite (often, only about an order of magni- 
tude larger than N/2) and the set {Xa}%Li in the expression above is called the 
basis set. We shall devote the next section to discuss its special characteristics, 
but, for now, it suffices to say that, in typical applications, the functions Xa are 
atom-centred, i.e., each one of them has non-negligible value only in the vicinity 
of a particular nucleus. Therefore, like all the electronic wavefunctions we have 
dealt with in the last sections, they parametrically depend on the positions R of 
the nuclei (see sec. [3J. This is why, sometimes, the functions Xa are called atomic 
orbital^ (AO) (since they are localized at individual atoms), the (pi are referred 
to as molecular orbitals (MO) (since they typically have non-negligible value in 
the whole space occupied by the molecule), and the approximation in eq. (|113p is 
called linear combination of atomic orbitals (LCAO). In addition, since we volun- 
tarily circumscribe to real-RHF, we assume that both the coefficients c a i and the 
functions Xa in the above expression are real. 

Now, if we introduce the linear combination in eq. (| 1 1 3 1) into the Hartree-Fock 
equations in (j95j) . multiply the result from the left by Xb (for a general value of b) 
and integrate on r, we obtain 

^2F ba c aI = e I Y,S ba c a i , /= l,...,N/2 , b = 1,...,M , (114) 

a a 

where we denote by Fb a the (b, a)-element of the Fock matrix^, and by Sba the 
one of the overlap matrix, defined as 

Fba ■= (Xb\ F[<p] \xa) and S ba := (xb\Xa) , (H5) 

respectively. 

Note that we do not ask the Xa in the basis set to be mutually orthogonal, so 
that the overlap matrix is not diagonal in general. 

Next, if we define the M x M matrices F[c] := (F a b) and S := (S a b), together 
with the (column) M- vector cj := (c a j), we can write eq. (j!14p in matricial form: 

F[c] Cl = ei Scj . (116) 



1 Some authors [18] suggest that, being strict, the term atomic orbitals should be reserved for the 
one-electron wavefunctions <f>j that are the solution of the Hartree-Fock problem (or even to the exact 
Schrodinger equation of the isolated atom), and that the elements \ a in the basis set should be termed 
simply localized functions. However, it is very common in the literature not to follow this recommendation 
and choose the designation that appear in the text [18,79,94]. We shall do the same for simplicity. 

2 Note that the RHF superindex has been dropped from F. 
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Hence, using the LCAO approximation, we have traded the N/2 coupled integro- 
differential Hartree-Fock equations in (195p for this system of N/2 algebraic equa- 
tions for the N/2 orbital energies e\ and the M ■ N/2 coefficients c a i, which are 
called Roothaan-Hall equations [79,80] and which are manageable in a computer. 

Now, if we forget for a moment that the Fock matrix depends on the coefficients 
c a i (as stressed by the notation F[c]) and also that we are only looking for N/2 
vectors cj while the matrices F and S are M x M, we may regard the above 
expression as a M-dimensional generalized eigenvalue problem. Many properties 
are shared between this kind of problem and a classical eigenvalue problem (i.e., 
one in which S ao = 5 ao ) [79], being the most important one that, due to the 
Hermiticity of F[c], one can find an orthonormal set of M vectors c a corresponding 
to real eigenvalues e a (where, of course, some eigenvalue could be repeated). 

In fact, it is using this formalism how most of actual Hartree-Fock computations 
are performed, although the reader must also note that other approaches, in which 
the orthonormality constraints are automatically satisfied due to the choice of vari- 
ables also exist in the literature [93] . The general outline of the iterative procedure 
is essentially the same as the one discussed in sec. [7) Choose a starting-guess for 
the coefficients c a j (let us denote it by c®j), construct the corresponding Fock ma- 
trix i^c ]]] and solve the generalized eigenvalue problem in eq. (|116p . By virtue of 
the aufbau principle discussed in the previous section, from the M eigenvectors c a , 
keep the N/2 ones c\ that correspond to the N/2 lowest eigenvalues ej, construct 
the new Fock matrix -Ffc 1 ] and iterate (by convention, the eigenvalues e", for all 
n, are ordered from the lowest to the largest as a runs from 1 to M). This pro- 
cedure ends when the n-th solution is close enough (in a suitable defined way) to 
the (n — l)-th one. Also, note that, after convergence has been achieved, we end 
up with M orthogonal vectors c a . Only the N/2 ones that correspond to the low- 
est eigenvalues represent real one-electron solutions and they are called occupied 
orbitals; the M — N/2 remaining ones do not enter in the A^-electron wavefunction 
(although they are relevant for calculating corrections to the Hartree-Fock results) 
and they are called virtual orbitals. 

Regarding the mathematical foundations of this procedure, let us stress, however, 
that, whereas in the finite-dimensional GHF and UHF cases it has been proved that 
the analogous of Lieb and Simon's theorem (see the previous section) is satisfied, 
i.e., that the global minimum of the original optimization problem corresponds to 
the lowest eigenvalues of the self-consistent Fock operator, in the RHF and ROHF 
cases, contrarily, no proof seems to exist [55]. Of course, in practical applications, 
the positive result is assumed to hold. 

Finally, if we expand F a ^ in eq. (|115p . using the shorthand \a) for \xa), we have 



where we have introduced the density matrix D c d[c], and also the matrix G^, 
made up by the two-electron four-centre integrals {ac\ 1/r \bd) (also called electron 
repulsion integrals (ERIs)) defined by 



1 Note (in eq. (11171 . for example) that the Fock matrix only depends on the vectors c a with a < N/2. 




Gcd 
ab 



(117) 



D cd [c] 
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(ac\ l - \bd) := // X a (r)Xc(rO Xft (r) Xd (rQ ^ 
t JJ \r — r'\ 

It is also convenient to introduce the Coulomb (J a b[c\) and exchange (K ab [c\) 
matrices 

Jab[c] := D cd[c](ac\ i \bd) , (119a) 

c,d 

K ab [c] :=Y,D cd [c]{ac\ ~\db) , (119b) 

c,d 

in terms of which, the Fock operator in eq. (|117p may be expressed as 

F ab = (a\ fi \b) + 2J ab [c] - K ab [c] . (120) 

After SCF convergence has been achieved, the RHF energy in the finite- 
dimensional case can be computed using the discretized version of eq. (i98j) : 

N/2 N/2 / l 1 \ 

# = 2 ^ £j - 2c aI c bJ c cI c dJ (ab\ - \cd) - c a ic b jc cJ c dI (ab\ - \dc) = 



I I, J a,b,c,d 

N/2 N/2 



2 ^2 £ i ^2 c aI c bJ c cI c d j{ab\ - \cd) 



I I, J a,b,c, d 

N/2 



2 il £ i- D ac [c]D bd [c}(ab\ - \cd) , (121) 



a,b,c, d 



where a convenient rearrangement of the indices in the two sums has been per- 
formed from the first to the second line. 



9 Introduction to Gaussian basis sets 

In principle, arbitrary functions may be chosen as the Xa to solve the Roothaan- 
Hall equations in the previous section, however, in eq. (|117p . we see that one of 
the main numerical bottlenecks in SCF calculations arises from the necessity of 
calculating the 0(M 4 ) four-centre integrals^ (ab\ £ \ cd), since the solution of the 
generalized eigenvalue problem in eq. (|116p typically scales only like 0(M 3 ), and 
there are 0(M 2 ) two-centre (a\ h \b) integrals (see however the next section). Either 
if these integrals are calculated at each iterative step and directly taken from RAM 
memory (direct SCF) or if they are calculated at the first step, written to disk, 
and then read from there when needed (conventional SCF), an appropriate choice 



1 If the symmetry properties of the integrals are used, the precise number of ERIs is found to be ^M(M - 
1)(M 2 +M + 2) [95]. 
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of the functions Xa m the finite basis set is essential if accurate results are sought, 
M is intended to be kept as small as possible and the integrals are wanted to be 
computed rapidly. When one moves into higher-level theoretical descriptions and 
the numerical complexity scales with M even more unpleasantly, the importance 
of this choice greatly increases. 

In this section, in order to support that study, we shall introduce some of the 
concepts involved in the interesting field of basis-set design. For further details not 
covered here, the reader may want to check refs. [18,19,96,97]. 

The only analytically solvable molecular problem in non-relativistic quantum 
mechanics is the hydrogen-like atom, i.e., the system formed by a nucleus of charge 
Z and only one electron (H, He + , Li 2+ , etc.). Therefore, it is not strange that all the 
thinking about atomic-centred basis sets in quantum chemistry is much influenced 
by the particular solution to this problem. 

The spatial eigenfunctions of the Hamiltonian operator of an hydrogen-like atom, 
in atomic units and spherical coordinates, reacQ 

- /(WSSi (f (¥ r ) • 

(122) 

where n, / and m are the energy, total angular momentum and z-angular mo- 
mentum quantum numbers, respectively. Their ranges of variation are coupled: all 
being integers, n runs from 1 to oo, I from to n — 1 and m from —I to I. The 
function L^}_-. is a generalized Laguerre polynomial [98], for which it suffices to 
say here that it is of order n — l — 1 (thus having, in general, n — l — 1 zeros) , and the 
function Yi m (9,<p) is a spherical harmonic, which is a simultaneous eigenfunction 
of the total angular momentum operator l 2 (with eigenvalue 1(1 + 1)) and of its 
z-component l z (with eigenvalue m). 

The hope that the one-electron orbitals that are the solutions of the Hartree-Fock 
problem in many-electron atoms could not be very different from the 4> n i m abova^l, 
together with the powerful chemical intuition that states that 'atoms- in- molecules 
are not very different from atoms-alone', is what mainly drives the choice of the 
functions \a in the basis set, and, in the end, the variational procedure that will 
be followed is expected to fix the largest failures coming from these too-simplistic 
assumptions. 

Hence, it is customary to choose functions that are centred at atomic nuclei and 
that partially resemble the exact solutions for hydrogen-like atoms. In this spirit, 
the first type of AOs to be tried [94] were the Slater-type orbitals (STOs), proposed 
by Slater [99] and Zener [100] in 1930: 

x r°(r;R aa ) :=A^ T °y^(^,^Jk-i?aJ"^ 1 exp(-Ca|r-i? Q j) , (123) 
where N~ TO is a normalization constant and £ a is an adjustable parameter. The 



1 For consistency with the rest of the text, the Born-Oppcnheimer approximation has been also assumed 
here. So that the reduced mass fi := m e M]y /(m e + Mjv) that should enter the expression is considered to 
be the mass of the electron fi ~ m e (recall that, in atomic units, m e = 1 and Mpf > 2000). 

2 Note that the JV-electron wavefunction of the exact ground-state of a non-hydrogen-like atom depends 
on 3N spatial variables in a way that cannot be written, in general, as a Slater determinant of one- 
electron functions. The image of single electrons occupying definite orbitals, together with the possibility 
of comparing them with the one-particle eigenfunctions of the Hamiltonian of hydrogen-like atoms, vanishes 
completely outside the Hartree-Fock formalism. 
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index a a is that of the nucleus at which the function is centred, and, of course, in 
the majority of cases, there will be several Xa T ° corresponding to different values of 
a centred at the same nucleus. The integers l a and m a can be considered^quantum 
numbers, since, due to the fact that the only angular dependence is in K C ™ (see 
below for a definition), the STO defined above is still a simultaneous eigenstate of 
the one-electron angular momentum operators 1 2 and l z (with the origin placed at 
Ra a )- The parameter n a , however, should be regarded as a 'principal (or energy) 
quantum number' only by analogy, since, on the one hand, it does not exist a 'one- 
atom Hamiltonian' whose exact eigenfunctions it could label and, on the other 
hand, only the leading term of the Laguerre polynomial in eq. (I122p has been kept 
in the STCB 

Additionally, in the above notation, the fact that Xa T ° parametrically depends 
on the position of a certain a -th nucleus has been stressed, and the functions 
^I'm ' wn i c h are called real spherical harmonics [101] (remember that we want to 
do real RHF), are defined in terms of the classical spherical harmonics Yi aTria by 



Yl S a m a (0a a , l Pa a ) 

where c stands for cosine, s for sine, the functions P™ a are the associated Leg- 
endre polynomials [98], and the spherical coordinates 9 aa and <p aa also carry the 
ct a -label to remind that the origin of coordinates in terms of which they are defined 
is located at R aa . Also note that, using that YJ C = Y^q, there is the same number 
of real spherical harmonics as of classical ones. 

These Xa TO have some good physical properties. Among them, we shall mention 
that, for \r—R aa | — ► 0, they present a cusp (a discontinuity in the radial derivative), 
as required by Kato's theorem [102]; and also that they decay at an exponential 
rate when |r — R aa \ — ► oo, which is consistent with the image that, an electron 
that is taken apart from the vicinity of the nucleus must 'see', at large distances, 
an unstructured point-like charge (see, for example, the STO in fig. [2]). Finally, the 
fact that they do not present radial nodes (due to the aforementioned absence of 
the non-leading terms of the Laguerre polynomial in eq. (|122p ) can be solved by 
making linear combinations of functions with different values of cJ3- 

Now, despite their being good theoretical candidates to expand the MO <\>j that 
make up the A^-particle solution of the Hartree-Fock problem, these STOs have se- 
rious computational drawbacks: Whereas the two-centre integrals (such as (o| h \ b) 
in eq. (|117p ) can be calculated analytically, the four-centre ERIs (ac\ 1/r \bd) can 
not [18, 94] if functions like the ones in eq. (|123l) are used. This fact, which was 
known as "the nightmare of the integrals" in the first days of computational quan- 
tum chemistry [94] , precludes the use of STOs in practical ab initio calculations of 
large molecules. 



Yl a m a + 17 



"' a oc P ; ma (cos 9 aa ) cos(m a ip aa 



V2 

Yl a m a — Y laTJ; 

' V2 



(124a) 



- cx P, m °(cos0 a J sin (m a </? a J , (124b) 



3 If we notice that, within the set of all possible STOs (as defined in eq. H123I I). every hydrogen-like energy 
eigenfunction (see eq. (| 1 22 1) ~) can be formed as a linear combination, we easily see that the STOs constitute 
a complete basis set. This is important to ensure that the Hartree-Fock limit could be actually approached 
by increasing M. 

1 This way of proceeding renders the choice of the exponent carried by the r — Rot a \ part (n a — 1 in the 
case of the STO in eq. I|123| a rather arbitrary one. As a consequence, different definitions may be found 
in the literature and the particular exponent chosen in actual calculations turns out to be mostly a matter 
of computational convenience. 
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A major step to overcome these difficulties that has revolutioned the whole field 
of quantum chemistry [55, 94] was the introduction of Cartesian Gaussian-type 
orbitals (cGTO): 



X f TO (r ;j R Q J : = 

Mf^O( r l_ R l a f( r 2_ R 2 ff r ,_^ ] rxp{ a \ r - R ^f) .,125) 



where the r p and the Ra a , with p = 1,2, 3, are the Euclidean coordinates of the 
electron and the a a -th nucleus respectively, and the integers l£, 1% and l£, which 
take values from to oo, are called orbital quantum number]^. 

Although these GTOs do not have the good physical properties of the STOs 
(compare, for example, the STO and the GTO in fig. [2]), in 1950, Boys [103] showed 
that all the integrals appearing in SCF theory could be calculated analytically if 
the Xa had the form in eq. (|125|> . The enormous computational advantage that 
this entails makes possible to use a much larger number of functions to expand the 
one-electron orbitals <f>i if GTOs are used, partially overcoming their bad short- 
and long-range behaviour and making the Gaussian-type orbitals the universally 
preferred choice in SCF calculations [18]. 

To remedy the fact that the angular behaviour of the Cartesian GTOs in eq. (I125P 
is somewhat hidden, they may be linearly combined to form Spherical Gaussian- 
type orbitals (sGTO): 



X f TO (r;R aa ) :=AAf TO y^(^,^Jk-^r°exp(-C a |r-i? Q J 2 ) , (126) 

which are proportional to the real spherical harmonic Y, '^ (0 aa , cp aa ) , and to 
which the same remarks made in footnote Q] in page 1451 for the STOs, regarding the 
exponent in the \r — R aa \ part, may be applied. 

The fine mathematical details about the linear combination that relates the 
Cartesian GTOs to the spherical ones are beyond the scope of this introduction. 
We refer the reader to refs. [104] and [101] for further information and remark here 
some points that will have interest in the subsequent discussion. 

First, the cGTOs that are combined to make up a sGTO must have all the 
same value of l a := + la + and, consequently, this sum of the three orbital 
quantum numbers l£ , la and l^ in a particular Cartesian GTO is typically (albeit 
dangerously) referred to as the angular momentum of the function. In addition, 
apart from the numerical value of l a , the spectroscopic notation is commonly used 
in the literature, so that cGTOs with l a = 0, 1, 2, 3, 4, 5, . . . are called s, p, d, f g, h, 
. . . , respectively. Where the first four come from the archaic words sharp, principal, 
diffuse and fundamental, while the subsequent ones proceed in alphabetical order. 

Second, for a given l a > 1, there are more Cartesian GTOs ((/ a +l)(/ a +2)/2) than 
spherical ones (2l a + 1), in such a way that, from the {l a + l)(l a + 2)/2 functionally 
independent linear combinations that can be formed using the cGTOs of angular 
momentum l a , only the angular part of 2l a + 1 of them turns out to be proportional 
to a real spherical harmonic Yf'^ (8 aa ,ip aa ); the rest of them are proportional to 
real spherical harmonic functions with a different value of the angular momentum 



1 Since the harmonic-oscillator energy eigenfunctions can be constructed as linear combinations of Carte- 
sian GTOs, we have that the latter constitute a complete basis set and, like in the case of the STOs, we 
may expect that the Hartree-Fock limit is approached as M is increased. 
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quantum number. For example, from the six different d-Cartesian GTOs, whose 
polynomial parts are x 2 , y 2 , z 2 , xy, xz and yz (using an evident, compact notation), 
only five different spherical GTOs can be constructed: the ones with polynomial 
parts proportional to 2z 2 — x 2 — y 2 , xz, yz, x 2 — y 2 and xy [104]. Among these new 
sGTOs, which, in turn, are proportional (neglecting also powers of r, see footnoted 

in page l4"5j) to the real spherical harmonics Y20 > ^21 > ^21 > ^22 an< ^ ^22 j the hnear 
combination x 2 + y 2 + z 2 is missing, since it presents the angular behaviour of an 
s-orbital (proportional to loo)- 

Finally, let us remark that, whereas Cartesian GTOs in eq. ()125|) are easier to 
be coded in computer applications than sGTOs [104], it is commonly accepted 
that these spurious spherical orbitals of lower angular momentum that appear 
when cGTOs are used do not constitute efficient choices to be included in a basis 
set [101] (after all, if we wanted an additional s-function, why include it in such 
an indirect and clumsy way instead of just designing a specific one that suits our 
particular needs?). Consequently, the most common practice in the field is to use 
Cartesian GTOs removing from the basis sets the linear combinations such as the 
x 2 + y 2 + z 2 above. 

Now, even if the integrals involving cGTOs can be computed analytically, there 
are still 0(M 4 ) of them in a SCF calculation. For example, in the model dipeptide 
HCO-L-Ala-NH2, which is commonly used to mimic an alanine residue in a protein 
[73,76,105,106], there are 62 electrons and henceforth 31 RHF spatial orbitals $1. 
If a basis set with only 31 functions is used (this is a lower bound that will be 
rarely reached in practical calculations due to symmetry issues, see below), near 
a million of four-centre (ac\ 1/r \bd) integrals must be computed. This is why, one 
must use the freedom that remains once the decision of sticking to cGTOs has 
been taken (namely, the choice of the exponents d and the angular momentum l a ) 
to design basis sets that account for the relevant behaviour of the systems studied 
while keeping M below the 'pain threshold'. 

The work by Nobel Prize John Pople's group has been a major reference in this 
discipline, and their STO-nG family [107], together with the split- valence Gaussian 
basis sets, 3-21G, 4-31G, 6-31G, etc. [108-115], shall be used here to exemplify 
some relevant issues. However, note that most of the concepts introduced are also 
applicable to more modern basis sets, such as those by Dunning [116]. 

To begin with, let us recall that the short- and long-range behaviour of the Slater- 
type orbitals in eq. f)123j) is better than that of the more computationally efficient 
GTOs. In order to improve the physical properties of the latter, it is customary 
to linearly combine M a Cartesian GTOs, denoted now by ^a (^ = 1, ...,M ), 
and termed primitive Gaussian-type orbitals (PGTO), having the same atomic 
centre Ra a , the same set of orbital quantum numbers, l£ , la and 1*, but different 
exponents £<f , to make up a contracted Gaussian-type orbitals (CGTO), defined by 



Ma 

Xa(r;R aa ) :=^2g^(r;R aa ) = 

l* ly i' M " 

(r 1 - <) " (r 2 - <) ° (r 3 - <) ' ^g^exp ( - \r - R a f) ,(127) 

where the normalization constants Ma have been kept inside the sum because 
they typically depend on £<f . Also, we denote now by Mq the number of contracted 
GTOs and, by Mp := ^ a M a , the number of primitive ones. 
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contracted GTO 
primitive GTOs 
STO 
single GTO 




12 3 4 

r a 

Figure 2. Radial behaviour of the ls-contracted GTO of the hydrogen atom in the STO-3G basis 
set [107], the three primitive GTOs that form it, the STO that is meant to be approximated and a single 
GTO with the same norm and exponent as the STO. The notation r a is shorthand for the distance to 

the Q a -th nucleus |r — Ra a ■ 

In the STO-reG family of basis sets, for example, n primitive GTOs are used for 
each contracted one, fitting the coefficients ga and the exponents Qa to resemble the 
radial behavior of Slater-type orbitals [107]. In fig. [H the ls-contracted GTO (see 
the discussion below) of the hydrogen atom in the STO-3G basis set is depicted, 
together with the three primitive GTOs that form it and the STO that is meant to 
be approximated!]]. We can see that the contracted GTO has a very similar behavior 
to the STO in a wide range of distances, while the single GTO that is also shown 
in the figure (with the same norm and exponent as the STO) has not. 

Typically, the fitting procedure that leads to contracted GTOs is performed on 
isolated atoms and, then, the already mentioned chemical intuition that states 
that 'atoms- in- molecules are not very different from atoms-alone' is invoked to 
keep the linear combinations fixed from there on. Obviously, better results would 
be obtained if the contraction coefficients were allowed to vary. Moreover, the 
number of four-centre integrals that need to be calculated depends on the number 
of primitive GTOs (like O(Mp)), so that we have not gained anything on this 
point by contracting. However, the size of the variational space is Mq (i-e., the 
number of contracted GTOs), in such a way that, once the integrals {ac\ 1/r \bd) are 
calculated (for non-direct SCF), all subsequent steps in the iterative self-consistent 
procedure scale as powers of Mc- Also, the disk storage (again, for non-direct 
schemes) depends on the number of contracted GTOs and, frequently, it is the disk 
storage and not the CPU time the limiting factor of a calculation. 

An additional chemical concept that is usually defined in this context and that 
is needed to continue with the discussion is that of shell: Atomic shells in quantum 
chemistry are defined analogously to those of the hydrogen atom, so that each elec- 
tron is regarded as 'filling' the multi-electron atom 'orbitals' according to Hund's 



Basis sets were obtained from the Extensible Computational Chemistry Environment Basis Set Database 
at http://www.emsl.pnl.gov/forms/basisform.html, Version 02/25/04, as developed and distributed by 
the Molecular Science Computing Facility, Environmental and Molecular Sciences Laboratory which is part 
of the Pacific Northwest Laboratory, P.O. Box 999, Richland, Washington 99352, USA, and funded by the 
U.S. Department of Energy. The Pacific Northwest Laboratory is a multi-program laboratory operated 
by Battelle Memorial Institute for the U.S. Department of Energy under contract DE-AC06-76RLO 1830. 
Contact Karen Schuchardt for further information. 
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Tabic 3. Exponents Q£ and contraction coefficients g£ of the primitive Gaussian shells 
that make up the three different contracted ones in the STO-3G basis set for carbon (see 
rcf. [107] and footnote[^in pagc [48| l. The exponents of the 2s- and 2p-shclls arc constrained 
to be the same. 



ls-shell 




2sp-shell 




/■» 


aS 




git (s) 


9a M (P) 


71.6168370 


0.15432897 


2.9412494 


0.15591627 


-0.09996723 


13.0450960 


0.53532814 


0.6834831 


0.60768372 


0.39951283 


3.5305122 


0.44463454 


0.2222899 


0.39195739 


0.70115470 



Table 4. Exponents Q^" and contraction coefficients gj* of the primitive Gaussian shells 
that make up the three different constrained ones in the 6-31G basis set for carbon (see 
ref. [109] and footnote in page [48j. In the ls-shell, there is only one contracted 
Gaussian shell made by six primitive ones, whereas, in the 2s- and 2p-valcnce shells, 
there are two CGSs, one of them made by three PGSs and the other one only by a 
single PGS. The exponents of the 2s- and 2p-shclls arc constrained to be the same. 



ls-shell 




2sp-shell 




AM 


gZ 




sS (p) 


bS (?) 


3047.52490 


0.0018347 


7.8682724 


-0.1193324 


0.0689991 


457.36951 


0.0140373 


1.8812885 


-0.1608542 


0.3164240 


103.94869 


0.0688426 


0.5442493 


1.1434564 


0.7443083 


29.21015 


0.2321844 








9.286663 


0.4679413 


0.1687144 


1.0000000 


1.0000000 


3.163927 


0.3623120 









rules [117]. Hence, the occupied shells of carbon, for example, are denned to be Is, 
2s and 2p, whereas those of, say, silicon, would be Is, 2s, 2p, 3s an 3p. Each shell 
may contain 2(2/ + 1) electrons if complete, where 21 + 1 accounts for the orbital 
angular momentum multiplicity and the 2 factor for that of electron spin. 

Thus, using these definitions, all the basis sets in the aforementioned STO-nG 
family are minimal; in the sense that they are made up of only 21 + 1 contracted 
GTOs for each completely or partially occupied shell, so that the STO-nG basis 
sets for carbon, for example, contain two s-type contracted GTOs (one for the ls- 
and the other for the 2s-shell) and three p-type ones (belonging to the 2p-shell). 
Moreover, due to rotational-symmetry arguments in the isolated atoms, all the 
21 + 1 functions in a given shell are chosen to have the same exponents and the 
same contraction coefficients, differing only on the polynomial that multiplies the 
Gaussian part. Such 21 + 1 CGTOs shall be said to constitute a Gaussian shell 
(GS), and we shall also distinguish between the primitive (PGS) and contracted 
(CGS) versions. 

In table [3] the exponents £<f and the contraction coefficients g£ of the primitive 
GTOs that make up the three different shells in the STO-3G basis set for carbon 
are presented (see ref. [107] and footnoted] in page [48]). The fact that the exponents 
Ca in the 2s- and 2p-shells are constrained to be the same is a particularity of some 
basis sets (like this one) which saves some computational effort and deserves no 
further attention. 

Next, let us introduce a common notation that is used to describe the contraction 
scheme: It reads (primitive shells) / [contracted shells], or alternatively (primitive 
shells) — ► [contracted shells]. According to it, the STO-3G basis set for carbon, 
for example, is denoted as (6s, 3p) — ► [2s, lp], or (6,3) — > [2,1]. Moreover, since for 
organic molecules it is frequent to have only hydrogens and the lst-row atoms C, 
N and O (whose occupied shells are identical^, the notation is typically extended 
and the two groups of shells are separated by a slash; as in (6s,3p/3s) — ► [2s,lp/ls] 
for STO-3G. 



In proteins, one may also have sulphur in cysteine and methionine residues. 
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The first improvement that can be implemented on a minimal basis set such as 
the ones in the STO-nG is the splitting, which consists in including more than one 
Gaussian shell for each occupied one. If the splitting is evenly performed, i.e., each 
shell has the same number of GSs, then the basis set is called double zeta (DZ), 
triple zeta (TZ), quadruple zeta (QZ), quintuple zeta (5Z), sextuple zeta (6Z), and 
so on; where the word zeta comes from the Greek letter £ used for the exponents. 
A hypothetical TZ basis set in which each CGTO is made by, say, four primitive 
GTOs, would read (24s,12p/12s) — > [6s,3p/3s] in the aforementioned notation. 

At this point, the already familiar intuition that says that 'atoms- in- molecules are 
not very different from atoms-alone' must be refined with another bit of chemical 
experience and qualified by noticing that 'core electrons are less affected by the 
molecular environment and the formation of bonds than valence electrons 0. In this 
spirit, the above evenness among different shells is typically broken, and distinct 
basis elements are used for the energetically lowest lying (core) shells than for the 
highest lying (valence) ones. 

On one side, the contraction scheme may be different. In which case, the notation 
used up to now becomes ambiguous, since, for example, the designation (6s,3p/3s) 
— ► [2s,lp/ls], that was said to correspond to STO-3G, would be identical for a 
different basis set in which the ls-Gaussian shell of heavy atoms be formed by 4 
PGSs and the 2s-Gaussian shell by 2 PGSs (in lst-row atoms, the 2s- and 2p-shells 
are defined as valence and the ls-one as core, while in hydrogen atoms, the ls-shell 
is a valence one). This problem can be solved by explicitly indicating how many 
primitive GSs form each contracted one, so that, for example, the STO-3G basis set 
is denoted by (33,3/3) — ► [2,1/1], while the other one mentioned would be (42,3/3) 
— ► [2,1/1] (we have chosen to omit the angular momentum labels this time). 

The other point at which the core and valence Gaussian shells may differ is in 
their respective 'zeta quality', i.e., the basis set may contain a different number of 
contracted Gaussian shells in each case. For example, it is very common to use a 
single CGS for the core shells and a multiple splitting for the valence ones. These 
type of basis sets are called split-valence and the way of naming their quality is 
the same as before, except for the fact that a capital V, standing for valence, is 
added either at the beginning or at the end of the acronyms DZ, TZ, QZ, etc., thus 
becoming VDZ, VTZ, VQZ, etc. or DZV, TZV, QZV, etc. 

Pople's 3-21G [113], 4-31G [108], 6-31G [109] and 6-311G [112] are well-known 
examples of split-valence basis sets that are commonly used for SCF calculations in 
organic molecules and that present the two characteristics discussed above. Their 
names indicate the contraction scheme, in such a way that the number before the 
dash represents how many primitive GSs form the single contracted GSs that is 
used for core shells, and the numbers after the dash how the valence shells are 
contracted, in much the same way as the notation in the previous paragraphs. 
For example, the 6-31G basis set (see tabled]), contains one CGS made up of six 
primitive GSs in the ls-core shell of heavy atoms (the 6 before the dash) and two 
CGS, formed by three and one PGSs respectively, in the 2s- and 2p-valence shells 
of heavy atoms and in the ls-shell of hydrogens (the 31 after the dash). The 6-311G 
basis set, in turn, is just the same but with an additional single-primitive Gaussian 
shell of functions in the valence region. Finally, to fix the concepts discussed, let 
us mention that, using the notation introduced above, these two basis sets may be 
written as (631,31/31) -► [3,2/2] and (6311,311/311) [4,3/3], respectively. 

Two further improvements that are typically used and that may also be incorpo- 



1 Recall that, for the very concept of 'core' or 'valence electrons' (actually for any label applied to a single 
electron) to have any sense, we must be in the Hartree-Fock formalism (see footnote [2] in page 144 \ , 
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rated to Pople's split-valence basis sets are the addition of polarization [110, 111] 
or diffuse functions [111,114,115]. We shall discuss them both to close both this 
section and the work. 

Up to now, neither the contraction nor the splitting involved GTOs of larger 
angular momentum than the largest one among the occupied shells. However, the 
molecular environment is highly anisotropic and, for most practical applications, 
it turns out to be convenient to add these polarization (large angular momentum) 
Gaussian shells to the basis set, since they present lower symmetry than the GSs 
discussed in the preceding paragraphs. Typically, the polarization shells are single- 
primitive GSs and they are denoted by adding a capital P to the end of the previous 
acronyms, resulting into, for example, DZP, TZP, VQZP, etc., or, say, DZ2P, TZ3P, 
VQZ4P if more than one polarization shell is added. In the case of Pople's basis 
sets [110,111], these improvements are denoted by specifying, in brackets and after 
the letter G, the number and type of the polarization shells separating heavy atoms 
and hydrogens by a commaJJ- For example, the basis set 6-31G(2df,p) contains 
the same Gaussian shells as the original 6-3 1G plus two d-type shells and one f- 
type shell centred at the heavy atoms, as well as one p-type shell centred at the 
hydrogens. 

Finally, for calculations in charged species (specially anions), where the charge 
density extends further in space and the tails of the distribution are more important 
to account for the relevant behaviour of the system, it is common to augment the 
basis sets with diffuse functions, i.e., single-primitive Gaussian shells of the same 
angular momentum as some preexisting one but with a smaller exponent £ than 
the smallest one in the shell. In general, this improvement is commonly denoted by 
adding the prefix aug- to the name of the basis set. In the case of Pople's basis sets, 
on the other hand, the insertion of a plus sign '+' between the contraction scheme 
and the letter G denotes that the set contains one diffuse function in the 2s- and 
2p-valence shells of heavy atoms. A second + indicates that there is another one in 
the ls-shell of hydrogens. For example, one may have the doubly augmented (and 
doubly polarized) 6-31++G(2d,2p) basis set. 

10 Modern developments: An introduction to linear-scaling methods 

The ground-breaking advances reviewed in the previous sections allow to routinely 
calculate, using Hartree-Fock SCF methods, physical properties of molecules of tens 
of atoms in present day computers. However, the simplest algorithms that can be 
devised to perform the limiting steps in such calculations are far from optimal. This 
large room for improvement, which may be enough to accommodate the exciting 
possibility of linearly scaling approaches that could open the door to thousands 
atoms computations, has been steering many innovative lines of research in the 
last years 

To close this review, we shall briefly outline here some of the hottest areas of 
modern development related to the topics discussed, specially those aimed to the 
reduction of the in principle 0(M 4 ) complexity associated to the calculation of the 
(ac\ 1/r \bd) ERIs in eq. (|118H . as well as the 0(M 3 ) cost of the diagonalization of 
the Fock operator in eq. (|116jrl . For wider reviews on these topics, we recommend 



1 There also exists an old notation for the addition of a single polarization shell per atom that reads 
6-31G** and that is equivalent to 6-31G(d,p). 

2 In principle, careful distinction must be made between the number of primitive GTOs Mp and the 
number of contracted ones Mq (see the previous section). However, since in this section we will be dealing 
only with approximate scalings without worrying much about the prefactor, the 'neutral' notation M has 
been chosen to denote a quantity which is linear on both Mp and Mq- 
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to the interested reader the accounts in refs. [118-120]. 

The first class of attempts to reduce the cost associated to the calculation of the 
0(M 4 ) ERIs (ac\ 1/r \bd) are those aimed to the improvement of the algorithms 
for analytically calculating them without approximations. There are basically two 
issues that render the construction of these algorithms a non trivial task: First, 
the fact that the only four-center ERIs that can be straightforwardly computed 
are the ones corresponding to a product of four s-type GTOs, while higher angular 
momentum ERIs may obtained from them in a non unique way. 

After using the Gaussian product rule [19] (see the previous section for the no- 
tation used) 
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which allows the four-centre integral to be turned into a two-center one, and 
whose absence for the case of STOs is the essential reason for their being non 
practical, we can turn the 6-dimensional ERI into a simple one-dimensional integral 
that can be readily calculated by different means [121,122]: 
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Such an integral is called a fundamental ERI and, as we said, ERIs involving 
GTOs with higher angular momentum than I = are obtained from this fun- 
damental one through iterative differentiations of eq. (|129p with respect to the 
nuclear positions, leading to recurrence relations expressing ERIs of a given angu- 
lar momentum as a function of the lower angular momentum ones. The particular 
flavour of these recurrence relations that is used is one of the matters in which the 
algorithms for calculating ERIs differ. 

The second issue that renders the construction of algorithms for computing ERIs 
non-trivial is related to how the contraction of primitive GTOs is handled. In the 
geminal paper by Boys [103], the most naive procedure was suggested, namely, 
the conversion of each ERI of contracted GTOs into a quadruple sum of ERIs 
of primitive GTOs. However, this does not take profit, for example, from the fact, 
mentioned in the previous section, that all CGTOs in the same contracted Gaussian 
shell are formed by PGTOs with the same set of exponents Ca- Much profit can be 
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taken from this and other constraints and, in fact, the cost-scaling profile of each 
algorithm with the contraction degree M a of the CGTOs is strongly correlated to 
the moment at which the transformation between CGTOs and PGTOs is performed 
[122]. 

Among the most used of these analytical algorithms, we can mention the Pople- 
Hehre (PH) one [123], which additionally exploits the fact that for each four- 
center ERI of low angular momentum there is a privileged Cartesian axis system 
in which many primitive integrals vanish by symmetry; the McMurchie- Davidson 
(MD) approach [124], which avoids the rotation in PH thus being more efficient 
for high angular momentum ERIs; the Obara-Saika-Schlegel (OSS) algorithm [125, 
126]; and a better defined and improved version of it: the Head-Gordon-Pople 
(HGP) algorithm [127]. Finally, if the moment at which the contraction is handled 
is chosen dynamically depending on the type of GTO appearing in the ERI, we 
have the PRISM modifications of MD and HGP: the MD-PRISM [121, 128, 129] 
and HGP-PRISM [122] algorithms, as well as a generalization of all the previous 
methods, called COLD PRISM [131]. 

Now, even if we implement any of these efficient methods for calculating the ERIs, 
there is still 0(M 4 ) of them. This scaling is simply too harsh for a too large class 
of applications. Therefore, the next natural step is to try to devise approximate 
methods that minimize the necessary decrease in accuracy at the same time that 
maximize the savings in computer time. Thanks to the particular characteristics of 
the ERIs, the Gaussian basis functions and the concrete physical problem intended 
to solvdU, this endeavour has been successfully pursued by many researchers and 
the 'holy grail' [18] of linear scaling with M is asymptotically approaching current 
calculations in large systems. Here, we shall discuss the basic issues that make 
this possible. For more in depth reviews, we point the readers to the accounts in 
references [119,120,132]. 

The first point to consider in order to reduce the 0(M ) scaling attains the so- 
called radial overlap. If we take a look at eq. (|127p . we can see that, irrespective of 
the polynomial prefactor which contains all the angular dependence of the GTO, 
every function Xa( r ) contains a radial exponential part (actually, a sum of expo- 
nentials). Therefore, if we consider any product Xa( r )Xb( r ) of two GTOs, we will 
always find a multiplying sum of terms such as those depicted in the expression 
for the Gaussian product rule in fll28j) . The exponential decay of all quantities £ a b 
in those terms with the distance \R aa ~ R<x b \ between the nuclei on which the 
GTOs are centred indicates that, among the M(M + l)/2 possible pair products 
Xa( r )Xb( r ), only O(M) of them will be non-negligible. To see this, note that if we 
fix (say) a, the only values of b that will yield a non-negligible product Xa(f)Xb( r ) 
are those for which \R aa — Ra b \ is 'small'. Since atoms do not interpenetrate in 
most of the conformations that shall be studied, this can only happen for a number 
of different 6's which is not O(M) but a constant independent of the size of the 
molecule. There are M different possible values of a for which the above reasoning 
can be repeated, and the result follows. 

As a consequence, if only O(M) pairs Xa( r )Xb( r ) are non-negligible, then only 
0(M 2 ) ERIs (ac\ 1/r \bd) may in principle contribute in a significant way to the 
Fock operator in eq. (|117p and not 0(M 4 ). (For similar estimations based on 
slightly different hypotheses, see [134,135].) One must also note that the discussion 
is complicated by the fact that the ERIs do not appear alone, but contracted with 
the density matrix elements D C( i- This allows for further improvements of the scaling 



1 The locality of many-electron Quantum Mechanics [132], which is related to the nearsightedness concept 
introduced by Kohn [133], is the main property that allows to finally achieve linearity. 
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beyond 0{M 2 ) which are discussed later in this section. 

Now, knowing that most of the ERIs are too small to be relevant, we do not 
know which ones and, if we calculated the 0(M 4 ) of them in order to spot the 
little ones, then we would have not gained anything. This simple argument shows 
the necessity of finding a set of estimators that allow us to selectively drop ERIs 
without calculating them. Of course, the number of estimators must also scale at 
worst like 0(M 2 ) in order for the scheme to be useful. 

One of the first and simplest such estimators, was introduced by Almlof et al. at 
the same time that they proposed the Direct SCF method [136]. In their scheme, 
each ERI (ac\ 1/r \bd) was approximated by the corresponding radial overlap factor 
£ab£cd (note that there are only 0(M 2 ) numbers £ a b)- Although this estimator was 
relatively successful, it presented the important drawback of not being an upper- 
bound for the ERIs, thus rendering the control of errors an a priori impossible 
task. In order to overcome this problem, Haser and Ahlrichs [137] later proposed a 
different estimator which can be assured to be always greater than the associated 
ERI. Following them, after the proof of positive definiteness by Roothaan [79], the 
electrostatic interaction energy between two continuous charge distributions, 

J \r — r'\ 

can be easily shown to satisfy the properties of an inner product. Hence, if we 
choose pi := XaXb, and pi := XcXd, we can use the well-known Schwarz inequality 
to show that 

(ac\l\bd)= [Pp£^ldrdr'< 
r J \r — r'\ 
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In such a way that, by calculating only the M{M + 1) /2 different two-index ERIs 
in the last term, we can safely bound from above the whole set containing 0(M 4 ) 
of them. 

After these seminal works, more sophisticated and tighter bounds have been 
developed through the years [138-140], their application being nowadays routine 
in Quantum Chemistry packages. 

In a second generation of methods, the 0(M 2 ) formal scaling achieved in practical 
calculations [141] by using the above ideas has been recently attacked. To this end, 
more specifically physical properties of the problem are exploited (see footnote [Din 
page [53]), and different strategies are used to deal with the Coulomb and exchange 
parts of the Fock operator in (|117j) . The main difference between the behaviours of 
these two contributions lies in the way in which the ERIs are contracted with the 
density matrix D ca i: In the 'classical' Coulomb part, the relative sizes of the ERIs 
(ac\ 1/r \bd) are largely correlated with the relative sizes of the associated elements 
D c d [142] , so that no decrease in the ~ M 2 scaling is expected from density matrix 
considerations. Differently, in the exchange terms, the fact that the elements D c d 
couple 'exchanged' indices in the {ac\ 1/r \db) ERIs produces cancellations which 
make this type of 'quantum' contributions rather short-range (for non-metallic 
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species) [142-144]. Hence, as for every short-range interaction, a number of terms 
scaling linearly with the system size is expected. Apart from the different treatment 
that this difference suggests, note that the separation of the Coulomb and exchange 
tasks allows for improved parallelization the computer codes [119]. 

For the exchange part, the aforementioned short-range behaviour allows to devise 
O(M) algorithms just by intelligently ordering the loops in which the ERIs are 
calculated. The fine details of these methods are rather technical and they are 
beyond the scope of this review. We point the reader to [144-147] and references 
therein for further information. 

In the Coulomb case, on the other hand, 0(M 2 ) terms still enter the sum in (|117|) . 
and more physically-based approximations must be used. The continuous fast 
multipole method (CFMM) by White et al. [148], for example, is probably one 
of the most celebrated algorithms for calculating the Coulomb part of the Fock op- 
erator. It is a generalization for continuous charge distributions of the fast multipole 
method (FMM), introduced by Greengard and Rokhlin [149] in a ground-breaking 
paper and aimed for point-like charges. In both FMM and CFMM, a clever hier- 
archical tree-like division of the space into cells is performed^, and the far away 
regions are approximated via truncated multipole expansions. 

These two ingredients, which allow to calculate the Coulomb contribution in 
0{M) steps for large systems, are common to most of the fast Coulomb algorithms^]. 
Despite this similarity, the room for improvement seems still large enough to accom- 
modate a vigorous field with many publications appearing each year. Let us men- 
tion here, for example, the generalized cell multipole method (GCMM) by Kutteh et 
al., which can use moments higher than monopole [151]; the quantum chemical tree 
code (QCTC) by Challacombe et al. [152], which independently thresholds 'bra' 
and 'kef distributions; and the Gaussian very fast multipole method (GvFMM) by 
Strain et al. [153], which benefits from the idea, introduced in [154] for point-like 
charges, of using a dynamical maximum angular momentum to further speed up 
the calculations 

Note however, that the near-field contributions in these methods are still calcu- 
lated without approximations and represent a great portion of the computer time. 
In this line, some modern algorithms are appearing to alleviate this part of the 
work, such as, for example, the J matrix engine by White and Head-Gordon [155], 
which uses and improves the ideas discussed in the first part of this section about 
analytically calculating the ERIs; or the method by Izmaylov et al. [156], which im- 
plements a hierarchy of screening levels to eliminate negligible integrals. According 
to recent reports [157], the combination of CFMM, with the J matrix engine tech- 
nique and with the Fourier transform Coulomb (FTC) method by Fiisti-Molnar 
and Pulay [158] is nowadays probably the fastest way for assembling the Coulomb 
matrix. 

Now, once the construction process of the whole Fock matrix F a b (via its Coulomb 
and exchange parts) has been cast into the form of an O(M) algorithm, the im- 
portance of the second rate-limiting step in SCF procedures, the diagonalization 
of F a b, comes into focus. Although the prefactor of the (in principle) 0(M 3 ) diago- 
nalization step is very small and, for systems of less than a few thousands of atoms, 
the absolute time spent on it is smaller than the one needed for the formation of 
the Fock matrix [132,141,157,159], it is clear that, in the long run, it will dominate 
the calculations in larger systems and will become the relevant bottleneck [159]. 

We shall point out the essentials regarding the methods aimed to reduce the 



1 Basically, a truncation of a Barnes and Hut (BH) tree [150]. 

2 For a review of different approaches, see [132]. 
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scaling the diagonalization step. For more in depth reviews on the topic, we suggest 
to the reader the accounts in [132, 147, 159, 160]. 

The 0{M ?y ) complexity of classical diagonalization methods (such as the Givens- 
Householder one [161]) can be easily understood if we think that the core of the 
algorithm is just the multiplication of M x M matrices. Nevertheless, if the matrices 
multiplied are sparse, i.e., they have a number of non-negligible elements that scale 
not as 0(M 2 ) but as 0(M), then the product can be obtained in O(M) steps. 
As a result of the already mentioned locality properties of many-body Quantum 
Mechanics (see footnote[T]in page !53p . some matrices appearing in the SCF methods 
discussed in this review, namely, the density matrix D ao and the Fock operator F ao , 
are indeed sparse if the system is non-metallic, i.e., if it has a non- vanishing HOMO- 
LUMO gap [159,162]. The idea behind most of the modern algorithms for achieving 
(or avoiding) diagonalization with O(M) effort consists essentially in performing 
all operations using only these (local) sparse matrices, and avoiding (non-local) 
dense ones, such as the MO coefficients matrix c ao . 

The attempts to improve the scaling of the diagonalization steps fall basically 
in two groups [163]. In the first one, profit is taken from the use of MOs which, 
instead of being extended over the whole molecule, such as the canonical orbitals 
used in the previous sections, are localized in a small region of space. In these class 
of methods, diagonalization [164] or pseudo-diagonalization (annihilation of the 
occupied-virtual blocks of the Fock matrix) [165, 166] is still performed, and the 
0{M) scaling is achieved because the representation of all operators in the basis 
of localized MOs is sparse. The second group of algorithms do not use the MOs as 
variables but the density matrix itself. Among them, two subfamilies of methods 
may be found: In the first one, the search for the optimal density matrix is simply 
treated as a standard optimization problem, being the score function the HF energy, 
and the variables the density matrix elements or a set of parameters of some suitable 
truncated expansion of it [171]. See, for example, the approaches by Ochsenfeld 
and Head-Gordon [167], by Salek et al. [168], by Millam and Scuseria [169], or 
by Ordejon et al. [170]. The other subfamily of density matrix-based methods 
use iterative procedures, in such a way that, at each step, the Fock operator is 
considered fixed and the equations are solved for the density matrix (much in 
the spirit of the MOs-based algorithms discussed in the previous sections) . In this 
group, we can find, for example, the approach in [171] using the Lanczos algorithm 
[172,173], or the method by Helgaker et al. [174]. 

These algorithms, combined with new strategies that also avoid diagonalization 
and improve SCF convergence properties, such as the one described in [175], rep- 
resent the final step towards linear Hartree-Fock methods in Quantum Chemistry. 

To close this section, although we have been concerned, up to now, with the 
calculation of the electronic ground-state given a fixed position of the nuclei, let 
us stress that it is also very common to use quantum chemical methods for finding 
the local energy minima of molecules. To this end, geometry optimizations must be 
performed and not only must we be able to compute the energy of the molecule, but 
also their derivatives with respect to the nuclear coordinate^. Additionally, these 
derivatives are also needed to do ab initio molecular dynamics in the ground-state 
Born-Oppenheimer PES [176]. 

The most naive approach, namely, the computation of the gradient of the energy 



1 Monte Carlo methods, in which the derivatives of the energy function are not needed, could also be used. 
However, although they are efficient (and often the only choice) for global optimization problems, most of 
the minimizations performed in Quantum Chemistry aim only for the closest local minimum. In such a case, 
methods which do need the derivatives, such as Newton-Raphson, steepest-descent or conjugate-gradient, 
usually perform better. 
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E(R) (using a simpler notation for it than V^ S (R) in (fTUj) ) by finite differences, is 
very inefficient for anything but the smallest molecules. To see this, one just need to 
notice that the gradient has as many components as the system degrees of freedom 
n. Hence, in order to obtain it, say, at a point R , we would have to compute n + 1 
times a single point energy, in order to know E(R ) and E(R + ARJ, being ARj, 
with i = 1, . . . , n, a small displacement in each of the nuclear degrees of freedom. 

This drawback was overcome in the late 60s by Pulay and others (see [177, 178] 
and references therein) with the introduction of the so-called analytical derivatives, 
in which the gradient (and higher-order derivatives) are expressed, like the energy 
itself, just as a function of ERIs involving the wavefunction at the point R . This 
marked an inflexion point in the development of optimization and molecular dy- 
namics algorithms that continues nowadays, as analytical derivatives are routinely 
introduced together with almost any new method for calculating the energy. In 
relation to the improvements reviewed in this work, for example, let us note that, 
in [179], the extension of the J matrix engine method to calculate the derivatives of 
the Coulomb part with respect of the nuclei coordinates is introduced; in [147], lin- 
ear scaling exchange gradients are developed; analytic derivatives for the GvFMM 
are provided in [180]; the HGP algorithm for calculating the ERIs is extended to 
the computation of derivatives of the ERIs as well in [127]; and we may find similar 
developments for the FTC method [157] or for algorithms that achieve diagonal- 
ization with linear effort [167,170]. 
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Appendix A: Functional derivatives 

A functional J 7 ^] is a mapping that takes functions to numbers (in this work, only 
functionals in the real numbers are considered): 



For example, if the function space Q is the Hilbert space of square-integrable 
functions L 2 (the space of states of quantum mechanics), the objects in the domain 
of T (i.e., the functions in L 2 ) can be described by infinite-tuples (ci,C2,...) of 
complex numbers and T may be pictured as a function of infinite variables. 

When dealing with function spaces Q that meet certain requirements^, the limit 
on the left-hand side of the following equation can be written as the integral on 
the right-hand side: 



1 We will not discuss the issue further but let it suffice to say that L 2 does satisfy these requirements. 



T: Q 



R 




(Al) 
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where x denotes a point in the domain of the functions in Q, and the the object 
(5J-[fyo]/5fy)(x) (which is a function of x not necessarily belonging to Q) is called 
the functional derivative of J~\^\ in the in the point ^>q. 

One common use of this functional derivative is to find stationary points of 
functionals. A function is said to be an stationary point of J-\^\ if: 



In order to render this definition operative, one must have a method for comput- 
ing (8J r [^o\/S'9)(x). Interestingly, it is possible, in many useful cases (and in all 
the applications of the formalism in this work), to calculate the sought derivative 
directly from the left-hand side of eq. (|Aip . The procedure, in such a situation, 
begins by writing out .Ff^o + £<5 , 3 / ] and clearly separating the different orders in 
e. Secondly, one drops the terms of zero order (by virtue of the subtraction of the 
quantity F^o]) and those of second order or higher (because they vanish when 
divided by e and the limit e —* is taken). The remaining terms, all of order one, 
are divided by e and, finally, (5J : [ 1 ^o]/5^)(x) is identified out of the resulting ex- 
pression (which must written in the form of the right-hand side of eq. (|A1|) . For a 
practical example of this process, see sees. [6] and [7J 



Appendix B: Lagrange multipliers and constrained stationary points 

Very often, when looking for the stationary points of a function (or a functional), 
the search space is not the whole one, in which the derivatives are taken, but 
a certain subset of it defined by a number of constraints. An elegant and useful 
method for solving the constrained problem is that of the Lagrange multipliers. 

Although it can be formally generalized to infinite dimensions (i.e., to functionals, 
see appendix A), here we will introduce the method in IR N in order to gain some 
geometrical insight and intuition. 

The general framework may be described as follows: we have a differentiable 
function f(x) that takes points in R N to real numbers and we want to find the 
stationary points of / restricted to a certain subspace £ of IR N , which is defined by 
K constraint^ 



Li{x) = i = l,...,K. (Bl) 

The points that are the solution of the constrained problem are those x belonging 
to X where the first order variation of / would be zero if the derivatives were taken 
'along' E. In other words, the points x where the gradient V/ has only components 
(if any) in directions that 'leave' S (see below for a rigorous formalization of these 
intuitive ideas). Thus, when comparing the solutions of the unconstrained problem 
to the ones of the constrained problem, three distinct situations arise (see fig. IB1|) : 

(i) A point a; is a solution of the unconstrained problem (i.e. it satisfies V/(oj) = 0) 
but it does not belong to S. Hence, it is not a solution of the constrained 
problem. This type of point is depicted as a white-filled circle in fig. IB11 



1 If the constraints are functionally independent, one must also ask that K < TV. If not, E will be either a 
point (if K = TV) or empty (if K > TV). 
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v/ 



Figure Bl. Schematic depiction of a constrained stationary points problem. E is the 2-dimensional 
search space, which is embedded in R 3 . The white- filled circles are solutions of the unconstrained 
problem only, the gray-filled circles are solutions of only the constrained one and the gray-filled circles 
inside white-filled circles are solutions of both. A, B, C and D are examples of different situations 

discussed in the text. 

(ii) A point x is a solution of the unconstrained problem (i.e. it satisfies V fix) = 0) 
and it belongs to E. Hence, it is also a solution of the constrained problem, since, 
in particular, the components of the gradient in directions that do not leave E 
are zero. This type of point is depicted as a gray-filled circle inside a white-filled 
circle in fig. IB11 

(iii) A point x is not a solution of the unconstrained problem (i.e., one has 
V/(a:) ^ 0) but it belongs to E and the only non-zero components of the gra- 
dient are in directions that leave E. Hence, it is a solution of the constrained 
problem. This type of point is depicted as a gray-filled circle in fig. IB1I 

From this discussion, it can be seen that, in principle, no conclusions about the 
number (or existence) of solutions of the constrained problem may be drawn only 
from the number of solutions of the unconstrained one. This must be investigated 
for each particular situation. 

In fig. IBll an schematic example in IR 3 is depicted. The constrained search space 
E is a 2-dimensional surface and the direction^ in which one leaves E is shown 
at several points as a perpendicular vector p s . In such a case, the criterium that 
V/ has only components in the direction of leaving E may be rephrased by asking 
V/ to be parallel to p s , i.e., by requiring that there exists a number A such that 
V/ = — Ap s . The case A = is also admitted and the explanation of the minus 
sign will be given in the following. 

In this case, K = 1, and one may note that the perpendicular vector p 2 is 
precisely p s = VLi. Let us define / as 



1 Note that, only if K = 1, i.e., if the dimension of E is N — 1, there will be a vector perpendicular to 
the constrained space. For K > 1, the dimensionality of the vector space of the directions in which one 
'leaves' E will be also larger than 1. 




(B2) 
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It is clear that, requiring the gradient of / to be zero, one recovers the condition 
V/ = — Ap s , which is satisfied by the points solution of the constrained stationary 
points problem. If one also asks that the derivative of / with respect to A be zero, 
the constraint L\{x) = that defines S is obtained as well. 

This process illustrates the Lagrange multipliers method in this particular ex- 
ample. In the general case, described by eq. (jBljl and the paragraph above it, it 
can be proved that the points x which are stationary subject to the constraints 
imposed satisfy 



~ df(x) 
Vf(x) = and -^-^ = i = l,...,K, (B3) 
oXi 



where 



K 

f(x):=f(x) + J2\ i L l (x) . (B4) 
i=l 

Of course, if one follows this method, the parameters Aj (which are, in fact, the 
Lagrange multipliers) must also be determined and may be considered as part of 
the solution. 

Also, it is worth remarking here that any two pair of functions, f± and of R N 
whose restrictions to £ are equal (i.e., that satisfy = /bis) obviously represent 
the same constrained problem and they may be used indistinctly to construct 
the auxiliary function /. This fact allows us, after having constructed / from a 
particular /, to use the equations of the constraints to change / by another simpler 
function which is equal to / when restricted to S. This freedom is used to derive 
the Hartree and Hartree-Fock equations, in sees. [6] and respectively. 

The formal generalization of these ideas to functionals (see appendix A) is 
straightforward if the space R N is substituted by a functions space J 7 , the points 
x by functions, the functions /, Li and / by functionals and the requirement that 
the gradient of a function be zero by the requirement that the functional derivative 
of the analogous functional be zero. 

Finally, let us stress something that is rarely mentioned in the literature: There 
is another (older) method, apart from the Lagrange multipliers one, for solving a 
constrained optimization problem: simple substitution. I.e., if we can find a set of 
N — K independent adapted coordinates that parameterize E and we can write the 
score function / in terms of them, we would be automatically satisfying the con- 
straints. Actually, in practical cases, the method chosen is a suitable combination 
of the two; in such a way that, if substituting the constraints in / is difficult, the 
necessary Lagrange multipliers are introduced to force them, and vice versa. 

As a good example of this, the reader may want to check the derivation of 
the Hartree equations in sec. [6] (or the Hartree-Fock ones in sec. [7}. There, we 
start by proposing a particular form for the total wavefunction <3? in terms of the 
one-electron orbitals (pi (see eq. (I22p ) and we write the functional T (which is 
the expected value of the energy) in terms of that special $ (see eq. (i24|) ). In a 
second step, we impose the constraints that the one-particle orbitals be normalized 
({4>i\4>i) = 1, i = 1,...,N) and force them by means of N Lagrange multipliers 
Aj. Despite the different treatments, both conditions are constraints standing on 
the same footing. The only difference is not conceptual, but operative: for the first 
condition, it would be difficult to write it as a constraint; while, for the second one, 
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it would be difficult to define adapted coordinates in the subspace of normalized 
orbitals. So, in both cases, the easiest way for dealing with them is chosen. 
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